Re: [REVIEW] Re: Compression of full-page-writes

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Arthur Silva <arthurprs(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-09-13 19:32:05
Message-ID: 20140913193205.GA24489@aart.rice.edu
Lists: pgsql-hackers

On Sat, Sep 13, 2014 at 12:55:33PM -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-09-13 08:52:33 +0300, Ants Aasma wrote:
> >> On Sat, Sep 13, 2014 at 6:59 AM, Arthur Silva <arthurprs(at)gmail(dot)com> wrote:
> >>> That's not entirely true. CRC-32C beats pretty much everything with the same
> >>> length quality-wise and has both hardware implementations and highly
> >>> optimized software versions.
>
> >> For better or for worse CRC is biased by detecting all single bit
> >> errors, the detection capability of larger errors is slightly
> >> diminished. The quality of the other algorithms I mentioned is also
> >> very good, while producing uniformly varying output.
>
> > There's also much more literature about the various CRCs in comparison
> > to some of these hash algorithms.
>
> Indeed. CRCs have well-understood properties for error detection.
> Have any of these new algorithms been analyzed even a hundredth as
> thoroughly? No. I'm unimpressed by evidence-free claims that
> something else is "also very good".
>
> Now, CRCs are designed for detecting the sorts of short burst errors
> that are (or were, back in the day) common on phone lines. You could
> certainly make an argument that that's not the type of threat we face
> for PG data. However, I've not seen anyone actually make such an
> argument, let alone demonstrate that some other algorithm would be better.
> To start with, you'd need to explain precisely what other error pattern
> is more important to defend against, and why.
>
> regards, tom lane
>
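For what it's worth, the hardware CRC-32C path Arthur mentions boils down
to the SSE4.2 CRC32 instruction. A minimal sketch, for illustration only,
assuming an x86-64 compiler invoked with -msse4.2 (the crc32c_sse42 name
is just made up for the example):

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <nmmintrin.h>          /* SSE4.2 _mm_crc32_* intrinsics */

/* CRC-32C (Castagnoli), standard 0xFFFFFFFF init and final XOR */
static uint32_t
crc32c_sse42(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t    crc = 0xFFFFFFFF;

    /* consume 8 bytes per instruction while we can */
    while (len >= 8)
    {
        uint64_t    chunk;

        memcpy(&chunk, p, 8);   /* avoid unaligned access */
        crc = _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }

    /* mop up the tail one byte at a time */
    while (len-- > 0)
        crc = _mm_crc32_u8((uint32_t) crc, *p++);

    return (uint32_t) crc ^ 0xFFFFFFFF;
}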

Here is a blog post on the development of xxHash:

http://fastcompression.blogspot.com/2012/04/selecting-checksum-algorithm.html
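If anyone wants to play with it, xxHash ships as a small C library; a
minimal usage sketch, assuming xxhash.h and xxhash.c from that project
are compiled in (the seed value here is arbitrary):

#include <stdio.h>
#include <string.h>
#include "xxhash.h"             /* from the xxHash project */

int
main(void)
{
    const char *buf = "full-page image bytes would go here";
    unsigned int seed = 0;      /* any fixed seed works for checksumming */
    unsigned int h;

    /* one-shot 32-bit hash over the buffer */
    h = XXH32(buf, strlen(buf), seed);
    printf("xxh32 = %08x\n", h);
    return 0;
}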

Regards,
Ken
