Re: What exactly is our CRC algorithm?

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: What exactly is our CRC algorithm?
Date: 2015-01-02 14:46:29
Message-ID: 54A6AF45.9050905@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/01/2015 09:17 AM, Abhijit Menon-Sen wrote:
> Hi.
>
> OK, here are the patches with the various suggestions applied.
>
> I found that the alignment didn't seem to make much difference for the
> CRC32* instructions, so I changed to process (len/8)*8bytes followed by
> (len%8)*1bytes, the way the Linux kernel does.

Ok.

In the slicing-by-8 version, I wonder if it would be better to do
single-byte loads to c0-c7, instead of two 4-byte loads and shifts.
4-byte loads are presumably faster than single byte loads, but then
you'd avoid the shifts. And then you could go straight into the
8-bytes-at-a-time loop, without the initial single-byte processing to
get the start address aligned. (the Linux implementation doesn't do
that, so maybe it's a bad idea, but might be worth testing..)

Looking at the Linux implementation, I think it only does the bswap once
per call, not inside the hot loop. Would it even make sense to keep the
crc variable in different byte order, and only do the byte-swap once in
END_CRC32() ?

The comments need some work. I note that there is no mention of the
slicing-by-8 algorithm anywhere in the comments (in the first patch).

Instead of checking for "defined(__GNUC__) || defined(__clang__)",
should add an explicit configure test for __builtin_bswap32().

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2015-01-02 15:04:19 Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Previous Message Amit Kapila 2015-01-02 14:18:45 Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]