Re: Quick-and-dirty compression for WAL backup blocks

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Quick-and-dirty compression for WAL backup blocks
Date: 2005-06-04 15:46:07
Message-ID: 26740.1117899967@sss.pgh.pa.us
Lists: pgsql-hackers

"Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk> writes:
>> A run-length compressor would be reasonably quick but I think that the
>> omit-the-middle-hole approach gets most of the possible win with even
>> less work.

> I can't think that a RLE scheme would be much more expensive than a 'count
> the hole' approach with more benefit, so I wouldn't like to discount this
> straight away...

RLE would require scanning the whole page with no certainty of win,
whereas count-the-hole is a certain win since you only examine bytes
that are potentially removable from the later CRC calculation.
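
A minimal sketch of the count-the-hole idea, under stated assumptions: the page is 8192 bytes, the unused space is a single run of zero bytes, and a probe offset near the middle of the page is known to fall inside it. The names SKETCH_BLCKSZ, PageHole, find_hole and pack_page are hypothetical and are not taken from the PostgreSQL source. The scan walks outward from the probe point, so it touches only zero bytes plus the two bytes that end the run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192      /* assumed block size for this sketch */

/* Hypothetical descriptor for the omitted run of zero bytes. */
typedef struct
{
    size_t hole_offset;         /* offset of the first omitted byte */
    size_t hole_length;         /* number of omitted bytes; 0 means no hole */
} PageHole;

/*
 * Find the run of zero bytes containing offset `probe`, scanning outward.
 * Returns a zero-length hole if the probe byte is not zero.
 */
static PageHole
find_hole(const unsigned char *page, size_t probe)
{
    PageHole hole = {0, 0};
    size_t   lo = probe;
    size_t   hi = probe;

    assert(page != NULL && probe < SKETCH_BLCKSZ);

    if (page[probe] != 0)
        return hole;            /* nothing removable at the probe point */

    while (lo > 0 && page[lo - 1] == 0)
        lo--;                   /* extend the run toward the page start */
    while (hi + 1 < SKETCH_BLCKSZ && page[hi + 1] == 0)
        hi++;                   /* extend the run toward the page end */

    hole.hole_offset = lo;
    hole.hole_length = hi - lo + 1;
    return hole;
}

/*
 * Copy the page into dst with the hole left out. Returns the number of
 * bytes written, which is also the number of bytes a later CRC would cover.
 */
static size_t
pack_page(const unsigned char *page, PageHole hole, unsigned char *dst)
{
    size_t tail = hole.hole_offset + hole.hole_length;

    assert(tail <= SKETCH_BLCKSZ);
    memcpy(dst, page, hole.hole_offset);
    memcpy(dst + hole.hole_offset, page + tail, SKETCH_BLCKSZ - tail);
    return SKETCH_BLCKSZ - hole.hole_length;
}
```

Every byte the scan skips is a byte that no longer has to be copied or checksummed, which is why the scan cannot cost more than it saves. An RLE pass has no such guarantee, since it must read the whole page before it knows whether anything compresses.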

> If you do manage to go ahead with the code, I'd be very interested to see
> some comparisons in bytes written to XLog for old and new approaches for
> some inserts/updates. Perhaps we could ask Mark to run another TPC benchmark
> at OSDL when this and the CRC changes have been completed.

I've completed a test run for this (it's essentially MySQL's sql-bench
done immediately after initdb). What I get is:

CVS tip of 6/1: ending WAL offset = 0/A364A780 = 2741282688 bytes written

CVS tip of 6/2: ending WAL offset = 0/8BB091DC = 2343604700 bytes written

or about a 15% savings. This is with a checkpoint_segments setting of 30.
One can presume that the savings would be larger at smaller checkpoint
intervals and smaller at larger intervals, but I didn't try more than
one set of test conditions.
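
The arithmetic behind the 15% figure can be checked with a short sketch. It assumes that, because the part before the slash is 0 in both WAL locations, the hexadecimal part after the slash can be read directly as a byte count from the start of the log. The names WAL_END_0601, WAL_END_0602 and savings_percent are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Ending WAL offsets from the two runs, taken from the figures above. */
#define WAL_END_0601 UINT64_C(0xA364A780)   /* 2741282688 bytes */
#define WAL_END_0602 UINT64_C(0x8BB091DC)   /* 2343604700 bytes */

/* Percentage of WAL volume saved going from `before` to `after`. */
static double
savings_percent(uint64_t before, uint64_t after)
{
    assert(before > 0 && after <= before);
    return 100.0 * (double) (before - after) / (double) before;
}
```

savings_percent(WAL_END_0601, WAL_END_0602) comes out at roughly 14.5, that is 397677988 bytes saved out of 2741282688, which rounds to the 15% quoted.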

I'd say that's an improvement worth having, especially considering that
it requires no net expenditure of CPU time. But the floor is certainly
still open for discussion of more complicated approaches.

regards, tom lane
