Re: Re: CRC

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Guenter <bruceg(at)em(dot)ca>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Re: CRC
Date: 2000-12-09 03:17:00
Message-ID: 10174.976331820@sss.pgh.pa.us
Lists: pgsql-hackers

A couple further observations while playing with this benchmark ---

1. This MD5 implementation is not too robust. On my machine it dumps
core if given a non-word-aligned data buffer. We could probably work
around that, but it bespeaks a little *too* much hand optimization...
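One common workaround for this kind of crash is to assemble each 32-bit word from individual bytes instead of dereferencing a word pointer. The sketch below assumes the core dump comes from casting the input buffer to a word pointer, which the message does not actually confirm. `load_le32` is a hypothetical helper name, not part of the MD5 code being benchmarked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian word from any address.
 * Byte-wise access is valid at every alignment, unlike *(uint32_t *)p,
 * which can fault on machines that require word-aligned loads. */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The price is a few extra shifts per word on machines that could have done the aligned load directly, which is presumably the cost the hand-optimized code was trying to avoid.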

2. It's a bad idea to ignore the startup/termination costs of the
algorithms. Of course startup/termination is trivial for CRC, but
it's not so trivial for MD5. I changed the code so that the md5
update() routine also calls md5_finish_ctx(), so that each inner
loop represents a complete MD5 calculation for a message of the
size of the main routine's fread buffer. I then experimented with
different buffer sizes. At a buffer size of 1K:

time benchcrc <random32

real 35.4
user 35.1
sys 0.0

time benchmd5 <random32

real 41.4
user 41.1
sys 0.0

At a buffer size of 100 bytes:

time benchcrc <random32

real 36.3
user 36.0
sys 0.0

time benchmd5 <random32

real 1:09.7
user 1:09.2
sys 0.0

(The total amount of data processed is 1000 MB in either case, but
it's divided into more messages in the second case.)
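Part of the per-message cost can be estimated from MD5's padding rule alone: every message is extended with one 0x80 byte, zero fill, and an 8-byte length field, then processed in 64-byte blocks. A minimal sketch of that arithmetic follows; `md5_block_count` is a hypothetical name used only for illustration, not a function from the benchmark.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 64-byte blocks MD5 compresses for one message of msg_len bytes.
 * Padding adds at least 9 bytes (0x80 marker plus 8-byte length field),
 * and the total is rounded up to a multiple of 64. */
static uint64_t md5_block_count(uint64_t msg_len)
{
    assert(msg_len <= UINT64_MAX - 72);   /* guard the addition below */
    return (msg_len + 1 + 8 + 63) / 64;
}
```

By this count a 100-byte message costs 2 blocks (128 bytes compressed, 28% more than the payload), while a 1K message costs 17 blocks (1088 bytes, about 6% more). Padding therefore explains only some of the gap between the two MD5 timings; the remainder is presumably the context setup and finalization work done once per message.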

I'm not sure exactly what Vadim has in mind for computing CRCs on the
WAL. If he's thinking of a CRC for each log message, the MD5 stuff
would be at a definite disadvantage. For disk-page checksums (8K or
more) this isn't too much of an issue, however.
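For comparison, a minimal bitwise CRC-32 sketch shows why startup and termination are trivial for CRC: each is a single constant or XOR, so the cost per message barely depends on how the data is divided. The reflected polynomial 0xEDB88320 and the function names are assumptions for illustration; they are not necessarily what the benchmark uses or what would go into the WAL code, and a real implementation would likely be table-driven.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Startup: a single constant. */
static uint32_t crc32_init(void) { return 0xFFFFFFFFu; }

/* Per-byte work: fold each input byte into the running CRC, one bit at a time. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            /* Shift right; XOR in the polynomial when the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Termination: a single XOR, with no padding block to process. */
static uint32_t crc32_finish(uint32_t crc) { return crc ^ 0xFFFFFFFFu; }
```

A per-message checksum would be computed as `crc32_finish(crc32_update(crc32_init(), buf, len))`, so a 100-byte message does essentially 100 bytes of work.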

regards, tom lane

Responses

  • Re: Re: CRC at 2000-12-10 05:37:42 from Bruce Guenter
