Re: [REVIEW] Re: Compression of full-page-writes

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-09-14 23:42:36
Message-ID: CAO_YK0W172yERPUQrBMNgqbCrBOZpTy_8t9ydSboi4gwGr6Dtg@mail.gmail.com
Lists: pgsql-hackers

On 14/09/2014 12:21, "Andres Freund" <andres(at)2ndquadrant(dot)com> wrote:
>
> On 2014-09-13 20:27:51 -0500, ktm(at)rice(dot)edu wrote:
> > >
> > > What we are looking for here is uniqueness thus better error
> > > detection. Not avalanche effect, nor cryptographically secure, nor bit
> > > distribution. As far as I'm aware CRC32C is unbeaten collision wise and
> > > time proven.
> > >
> > > I couldn't find tests with xxhash and crc32 on the same hardware so I
> > > spent some time putting together a benchmark (see attachment, to run it
> > > just start run.sh)
> > >
> > > I included a crc32 implementation using SSE4.2 instructions (which works
> > > on pretty much any Intel processor built after 2008 and AMD built after
> > > 2012), a portable Slice-By-8 software implementation and xxhash since
> > > it's the fastest software 32bit hash I know of.
> > >
> > > Here're the results running the test program on my i5-4200M
> > >
> > > crc sb8: 90444623
> > > elapsed: 0.513688s
> > > speed: 1.485220 GB/s
> > >
> > > crc hw: 90444623
> > > elapsed: 0.048327s
> > > speed: 15.786877 GB/s
> > >
> > > xxhash: 7f4a8d5
> > > elapsed: 0.182100s
> > > speed: 4.189663 GB/s
> > >
> > > The hardware version is insanely fast and works on the majority of
> > > Postgres setups, and the fallback software implementation is 2.8x slower
> > > than the fastest 32bit hash around.
> > >
> > > Hopefully it'll be useful in the discussion.
>
> Note that all these numbers aren't fully relevant to the use case
> here. For the WAL - which is what we're talking about and the only place
> where CRC32 is used with high throughput - the individual parts of a
> record are pretty darn small on average. So performance of checksumming
> small amounts of data is more relevant. Mind, that's not likely to go
> for CRC32, especially not slice-by-8. The cache footprint of the large
> tables is likely going to be noticeable in non micro benchmarks.
>

Indeed, the small input sizes are something I was missing. Something more
cache friendly would be better; it's just a matter of finding a better
candidate.

That said, I find it highly unlikely that the extra 4 kB of tables in sb8
brings its performance down to sb4 level, even considering the small inputs
and cache misses.
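
To put numbers on the table sizes, here is a minimal Python sketch (not the
benchmark code from the attachment) that builds CRC32C slicing tables and
computes the checksum with slice-by-4 or slice-by-8. The function names
make_tables, crc32c and table_bytes are made up for illustration, and the
footprint figure assumes each table entry is stored as a 4-byte integer, as
a C implementation would do.

```python
CRC32C_POLY = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def make_tables(n_slices):
    """Build n_slices lookup tables of 256 32-bit entries each."""
    tables = [[0] * 256 for _ in range(n_slices)]
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY if crc & 1 else 0)
        tables[0][i] = crc
    # Table k holds the CRC state of byte i followed by k zero bytes.
    for k in range(1, n_slices):
        for i in range(256):
            prev = tables[k - 1][i]
            tables[k][i] = (prev >> 8) ^ tables[0][prev & 0xFF]
    return tables


def table_bytes(n_slices):
    """Memory footprint assuming 4-byte entries (assumption, see above)."""
    return n_slices * 256 * 4


def crc32c(data, tables):
    """CRC32C of data, consuming len(tables) bytes per step when possible."""
    n = len(tables)
    crc = 0xFFFFFFFF
    pos = 0
    end = len(data)
    # The sliced loop needs at least 4 tables to absorb the 32-bit state.
    while n >= 4 and end - pos >= n:
        block = bytearray(data[pos:pos + n])
        for j in range(4):
            block[j] ^= (crc >> (8 * j)) & 0xFF
        crc = 0
        for j in range(n):
            crc ^= tables[n - 1 - j][block[j]]
        pos += n
    # Remaining tail bytes (or everything, for n < 4) go one at a time.
    for byte in data[pos:end]:
        crc = (crc >> 8) ^ tables[0][(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

With these definitions table_bytes(4) is 4096 and table_bytes(8) is 8192, so
the difference between sb4 and sb8 is the 4 kB mentioned above. Both variants
return 0xE3069283 for the standard check string b"123456789". Python timings
of this sketch say nothing about cache behaviour; it only shows the table
layout and the per-step work.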

For what it's worth, MySQL, Cassandra, Kafka, ext4 and XFS all use CRC32C
checksums in their WAL/journals.

> > Also, while I understand that CRC has a very venerable history and
> > is well studied for transmission type errors, I have been unable to find
> > any research on its applicability to validating file/block writes to a
> > disk drive.
>
> Which incidentally doesn't really match what the CRC is used for
> here. It's used for individual WAL records. Usually these are pretty
> small, far smaller than disk/postgres' blocks on average. There's a
> couple scenarios where they can get large, true, but most of them are
> small.
> The primary reason they're important is to correctly detect the end of
> the WAL. To ensure we're not interpreting half written records, or records
> from before the WAL file was overwritten.
>
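
A rough Python sketch of that end-of-log detection, under loud assumptions:
the record layout (4-byte length, 4-byte CRC, payload), the helper names and
the idea of mixing the record's log position into the CRC are all
hypothetical and are not PostgreSQL's actual WAL format. zlib.crc32 (plain
CRC-32, not CRC32C) stands in for the checksum only because it ships with
Python.

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # hypothetical layout: payload length, CRC


def record_crc(lsn, payload):
    # Cover the record's log position as well as its payload, so a stale
    # record left over from an earlier use of the file fails the check.
    return zlib.crc32(payload, zlib.crc32(struct.pack("<Q", lsn)))


def append_record(buf, base_lsn, payload):
    """Append one record to buf, a bytearray that starts at base_lsn."""
    lsn = base_lsn + len(buf)
    buf += HEADER.pack(len(payload), record_crc(lsn, payload)) + payload


def read_valid_records(buf, base_lsn):
    """Return payloads up to the first record that fails validation."""
    records, pos = [], 0
    while pos + HEADER.size <= len(buf):
        length, crc = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        payload = bytes(buf[start:start + length])
        if length == 0 or len(payload) < length:
            break  # zeroed space or a half written record
        if record_crc(base_lsn + pos, payload) != crc:
            break  # corrupt record, or stale data from an older log
        records.append(payload)
        pos = start + length
    return records
```

In this sketch the reader treats the first record that fails validation as
the end of the log, which covers both a write that was cut short and
leftover records from a previous use of the same file.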
>
> > While it is to quote you "unbeaten collision wise", xxhash,
> > both the 32-bit and 64-bit version are its equal.
>
> Aha? You take that from the smhasher results?
>
> > Since there seems to be a lack of research on disk based error
> > detection versus CRC polynomials, it seems likely that any of the
> > proposed hash functions are on an equal footing in this regard. As
> > Andres commented up-thread, xxhash comes along for "free" with lz4.
>
> This is pure handwaving.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
