Re: [REVIEW] Re: Compression of full-page-writes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Arthur Silva <arthurprs(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-09-15 10:57:27
Message-ID: 5416C617.3030405@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/15/2014 02:42 AM, Arthur Silva wrote:
> Em 14/09/2014 12:21, "Andres Freund" <andres(at)2ndquadrant(dot)com> escreveu:
>>
>> On 2014-09-13 20:27:51 -0500, ktm(at)rice(dot)edu wrote:
>>>>
>>>> What we are looking for here is uniqueness thus better error
> detection. Not
>>>> avalanche effect, nor cryptographically secure, nor bit distribution.
>>>> As far as I'm aware CRC32C is unbeaten collision wise and time proven.
>>>>
>>>> I couldn't find tests with xxhash and crc32 on the same hardware so I
> spent
>>>> some time putting together a benchmark (see attachment, to run it just
>>>> start run.sh)
>>>>
>>>> I included a crc32 implementation using ssr4.2 instructions (which
> works on
>>>> pretty much any Intel processor built after 2008 and AMD built after
> 2012),
>>>> a portable Slice-By-8 software implementation and xxhash since it's
> the
>>>> fastest software 32bit hash I know of.
>>>>
>>>> Here're the results running the test program on my i5-4200M
>>>>
>>>> crc sb8: 90444623
>>>> elapsed: 0.513688s
>>>> speed: 1.485220 GB/s
>>>>
>>>> crc hw: 90444623
>>>> elapsed: 0.048327s
>>>> speed: 15.786877 GB/s
>>>>
>>>> xxhash: 7f4a8d5
>>>> elapsed: 0.182100s
>>>> speed: 4.189663 GB/s
>>>>
>>>> The hardware version is insanely and works on the majority of Postgres
>>>> setups and the fallback software implementations is 2.8x slower than
> the
>>>> fastest 32bit hash around.
>>>>
>>>> Hopefully it'll be useful in the discussion.
>>
>> Note that all these numbers aren't fully relevant to the use case
>> here. For the WAL - which is what we're talking about and the only place
>> where CRC32 is used with high throughput - the individual parts of a
>> record are pretty darn small on average. So performance of checksumming
>> small amounts of data is more relevant. Mind, that's not likely to go
>> for CRC32, especially not slice-by-8. The cache fooprint of the large
>> tables is likely going to be noticeable in non micro benchmarks.
>
> Indeed, the small input sizes is something I was missing. Something more
> cache friendly would be better, it's just a matter of finding a better
> candidate.

It's worth noting that the extra tables that slicing-by-4 requires are
and *in addition to* the lookup table we already have. And slicing-by-8
builds on the slicing-by-4 lookup tables. Our current algorithm uses a
1kB lookup table, slicing-by-4 a 4kB, and slicing-by-8 8kB. But the
first 1kB of the slicing-by-4 lookup table is identical to the current
1kB lookup table, and the first 4kB of the slicing-by-8 are identical to
the slicing-by-4 tables.

It would be pretty straightforward to use the current algorithm when the
WAL record is very small, and slicing-by-4 or slicing-by-8 for larger
records (like FPWs), where the larger table is more likely to pay off. I
have no idea where the break-even point is with the current algorithm
vs. slicing-by-4 and a cold cache, but maybe we can get a handle on that
with some micro-benchmarking.

Although this is complicated by the fact that slicing-by-4 or -8 might
well be a win even with very small records, if you generate a lot of them.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2014-09-15 11:00:56 Re: B-Tree support function number 3 (strxfrm() optimization)
Previous Message Alexey Klyukin 2014-09-15 10:44:50 Re: implement subject alternative names support for SSL connections