Re: [REVIEW] Re: Compression of full-page-writes

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-09-12 20:27:49
Message-ID: 20140912202749.GB23806@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-09-12 23:17:12 +0300, Ants Aasma wrote:
> On Fri, Sep 12, 2014 at 10:38 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
> > I don't mean that we should abandon this patch - compression makes the WAL
> > smaller which has all kinds of other benefits, even if it makes the raw TPS
> > throughput of the system worse. But I'm just saying that these TPS
> > comparisons should be taken with a grain of salt. We probably should
> > consider switching to a faster CRC algorithm again, regardless of what we do
> > with compression.
>
> CRC is an awfully slow algorithm for checksums. We should
> consider switching it out for something more modern. CityHash,
> MurmurHash3 and xxhash look like pretty good candidates, being around
> an order of magnitude faster than CRC. I'm hoping to investigate
> substituting the WAL checksum algorithm for 9.5.
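
As an aside, any of those would be close to a drop-in replacement
API-wise. A minimal, untested sketch using xxHash's stock one-shot
XXH32() from xxhash.h; the wrapper name is made up:

#include <stddef.h>
#include <stdint.h>
#include "xxhash.h"

/* Hypothetical replacement for the WAL record checksum. */
static uint32_t
wal_record_checksum(const void *data, size_t len)
{
	/* Seed is arbitrary; any fixed constant will do. */
	return XXH32(data, len, 0);
}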

I think that might not be a bad plan. But switching to a fundamentally
different algorithm will involve *far* more effort and arguing. So
personally I'd just go with slice-by-4; that's relatively
uncontroversial, I think. Then maybe switch the polynomial so we can
use the CRC32 instruction.
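
For reference, the slice-by-4 inner loop is pretty simple; a minimal,
untested sketch (table generation elided, names illustrative rather
than the actual pg_crc ones; init/final inversion left to the caller,
as with INIT_CRC32/FIN_CRC32):

#include <stddef.h>
#include <stdint.h>

/* Assumed to be precomputed for the chosen polynomial. */
extern const uint32_t crc_table[4][256];

static uint32_t
crc32_sliceby4(uint32_t crc, const unsigned char *p, size_t len)
{
	/* Consume four input bytes per iteration, one table per byte. */
	while (len >= 4)
	{
		crc ^= (uint32_t) p[0] |
			((uint32_t) p[1] << 8) |
			((uint32_t) p[2] << 16) |
			((uint32_t) p[3] << 24);
		crc = crc_table[3][crc & 0xFF] ^
			crc_table[2][(crc >> 8) & 0xFF] ^
			crc_table[1][(crc >> 16) & 0xFF] ^
			crc_table[0][crc >> 24];
		p += 4;
		len -= 4;
	}
	/* Mop up the remaining tail a byte at a time. */
	while (len-- > 0)
		crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}

Note that this per-byte table indexing only works for a reflected CRC,
which is what we use. The instruction route would additionally mean
moving to the Castagnoli polynomial (CRC-32C), since that's what
SSE4.2's crc32 / _mm_crc32_u64() computes.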

> Given the room for improvement in this area I think it would make
> sense to just short-circuit the CRC calculations for testing this
> patch to see if the performance improvement is due to less data being
> checksummed.

FWIW, I don't think it's 'bad' that less data provides speedups, and I
don't really see a need to factor that out of the benchmarks.
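
That said, if someone does want to isolate it, it's a trivial hack.
Assuming the COMP_CRC32() accumulate macro from pg_crc.h, something
like this in a benchmark-only build (never to be committed):

/* Benchmark-only hack: make WAL CRC accumulation a no-op, so runs
 * isolate checksum cost from data volume.  The guard name is made up. */
#ifdef BENCHMARK_SKIP_WAL_CRC
#undef COMP_CRC32
#define COMP_CRC32(crc, data, len) ((void) 0)
#endif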

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
