| From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-hackers(at)postgreSQL(dot)org |
| Subject: | Re: Quick-and-dirty compression for WAL backup blocks |
| Date: | 2005-05-31 23:32:47 |
| Message-ID: | 1117582367.3844.805.camel@localhost.localdomain |
| Lists: | pgsql-hackers |

On Tue, 2005-05-31 at 16:26 -0400, Tom Lane wrote:
> The TODO item that comes to mind immediately is "Compress WAL entries".
> A more concrete version of this is: examine the page to see if the
> pd_lower field is between SizeOfPageHeaderData and BLCKSZ, and if so
> whether there is a run of consecutive zero bytes beginning at the
> pd_lower position. Omit any such bytes from what is written to WAL.
> (This definition ensures that nothing goes wrong if the page does not
> follow the normal page layout conventions: the transformation is
> lossless no matter what, since we can always reconstruct the exact page
> contents.) The overhead needed is only 2 bytes to show the number of
> bytes removed.
>
> The other alternatives that were suggested included running the page
> contents through the same compressor used for TOAST, and implementing
> a general-purpose run-length compressor that could get rid of runs of
> zeroes anywhere on the page. However, considering that the compression
> work has to be done while holding WALInsertLock, it seems to me there
> is a strong premium on speed. I think that lets out the TOAST
> compressor, which isn't amazingly speedy. (Another objection to the
> TOAST compressor is that it certainly won't win on already-compressed
> toasted data.) A run-length compressor would be reasonably quick but
> I think that the omit-the-middle-hole approach gets most of the possible
> win with even less work. In particular, I think it can be proven that
> omit-the-hole will actually require less CPU than now, since counting
> zero bytes should be strictly faster than CRC'ing bytes, and we'll be
> able to save the CRC work on whatever bytes we omit.
>
> Any objections?

None: completely agree with your analysis. Sounds great.
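
For readers following the thread, here is a minimal sketch of the omit-the-hole idea quoted above. It is an illustration only, not code from the thread or from PostgreSQL. The names `hole_length`, `pack_page` and `unpack_page` are hypothetical, `PAGE_HEADER_SIZE` is an assumed stand-in for SizeOfPageHeaderData, and `pd_lower` is passed in by the caller rather than read out of a real page header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192            /* default PostgreSQL block size */
#define PAGE_HEADER_SIZE 24    /* assumed stand-in for SizeOfPageHeaderData */

/*
 * Length of the run of zero bytes starting at pd_lower. Returns 0 when
 * pd_lower is outside the sane range, so an unconventional page is simply
 * logged in full.
 */
uint16_t
hole_length(const uint8_t *page, uint16_t pd_lower)
{
    size_t n = 0;

    if (pd_lower < PAGE_HEADER_SIZE || pd_lower >= BLCKSZ)
        return 0;
    while (pd_lower + n < BLCKSZ && page[pd_lower + n] == 0)
        n++;
    return (uint16_t) n;
}

/*
 * Copy the page into out (room for BLCKSZ bytes), skipping the hole.
 * Returns the number of bytes written; *hole_len is the 2-byte value
 * that would be stored alongside the backup block.
 */
size_t
pack_page(const uint8_t *page, uint16_t pd_lower,
          uint8_t *out, uint16_t *hole_len)
{
    size_t tail;

    *hole_len = hole_length(page, pd_lower);
    if (*hole_len == 0)
    {
        memcpy(out, page, BLCKSZ);
        return BLCKSZ;
    }
    tail = (size_t) BLCKSZ - pd_lower - *hole_len;
    memcpy(out, page, pd_lower);
    memcpy(out + pd_lower, page + pd_lower + *hole_len, tail);
    return pd_lower + tail;
}

/* Rebuild the exact original page by re-inserting hole_len zero bytes. */
void
unpack_page(const uint8_t *in, uint16_t pd_lower, uint16_t hole_len,
            uint8_t *page)
{
    if (hole_len == 0)
    {
        memcpy(page, in, BLCKSZ);
        return;
    }
    assert(pd_lower >= PAGE_HEADER_SIZE &&
           (size_t) pd_lower + hole_len <= BLCKSZ);
    memcpy(page, in, pd_lower);
    memset(page + pd_lower, 0, hole_len);
    memcpy(page + pd_lower + hole_len, in + pd_lower,
           (size_t) BLCKSZ - pd_lower - hole_len);
}
```

Because a nonzero hole length implies that pd_lower was in range, and the bytes before the hole (which include the page header) are kept, only the hole length needs to be stored. Bytes inside the hole are scanned once for zero and then skipped, so they would not need to be CRC'd.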
> It seems we are more or less agreed that 32-bit CRC ought to be enough
> for WAL; and we also need to make a change to ensure that backup blocks
> are positively linked to their parent WAL record, as I noted earlier
> today. So as long as we have to mess with the WAL record format, I was
> wondering what else we could get done in the same change.

Is this a change that would be backpatched as you suggested previously?
It seems a rather large patch to change three things at once. Can the
backpatch wait until 8.1 has gone through beta to allow the changes to
be proven?

Best Regards, Simon Riggs