Re: AdvanceXLInsertBuffer vs. WAL segment compressibility

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: AdvanceXLInsertBuffer vs. WAL segment compressibility
Date: 2016-07-26 13:31:50
Message-ID: 57976646.70502@anastigmatix.net
Lists: pgsql-hackers

On 07/26/2016 08:48 AM, Amit Kapila wrote:

> general, if you have a very low WAL activity, then the final size of
> compressed WAL shouldn't be much even if you use gzip. It seems your

9.5 pg_xlog, low activity test cluster (segment switches forced
only by checkpoint timeouts), compressed with gzip -9:

$ for i in 0*; do echo -n "$i " && gzip -9 <$i | wc -c; done
000000010000000100000042 27072
000000010000000100000043 27075
000000010000000100000044 27077
000000010000000100000045 27073
000000010000000100000046 27075

Log from live pre-9.4 cluster, low-activity time of day, delta
compression using rsync:

2016-07-26 03:54:02 EDT (walship) INFO: using 2.39s user, 0.4s system, 9.11s on wall:
231 byte 000000010000004600000029_000000010000004600000021_fwd
...
2016-07-26 04:54:01 EDT (walship) INFO: using 2.47s user, 0.4s system, 8.43s on wall:
232 byte 00000001000000460000002A_000000010000004600000022_fwd
...
2016-07-26 05:54:02 EDT (walship) INFO: using 2.56s user, 0.29s system, 9.44s on wall:
230 byte 00000001000000460000002B_000000010000004600000023_fwd

So when I say "factor of 100", I'm understating slightly. (Those
timings, for the curious, include sending a copy offsite via ssh.)
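
For anyone who wants to check the arithmetic, here is a minimal sketch that
recomputes the ratio from the byte counts quoted above. The two sets of
numbers come from different clusters (the 9.5 test cluster for gzip, the
pre-9.4 live cluster for the rsync deltas), so this is only a rough
comparison, and the variable names are purely illustrative.

```python
# Sizes in bytes, copied from the figures quoted in this message.
gzip_sizes = [27072, 27075, 27077, 27073, 27075]   # gzip -9, 9.5 test cluster
delta_sizes = [231, 232, 230]                      # rsync deltas, pre-9.4 live cluster

mean_gzip = sum(gzip_sizes) / len(gzip_sizes)
mean_delta = sum(delta_sizes) / len(delta_sizes)
ratio = mean_gzip / mean_delta

print(f"mean gzip size:  {mean_gzip:.1f} bytes")
print(f"mean delta size: {mean_delta:.1f} bytes")
print(f"ratio:           {ratio:.1f}")   # about 117
```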

> everything zero. Now, it might be possible to selectively initialize
> the fields that doesn't harm the methodology for archive you are using
> considering there is no other impact of same in code. However, it

Indeed, it is only the one header field that duplicates the low-
order part of the (hex) file name that breaks delta compression,
because it has always been incremented even when nothing else is
different, and it's scattered 2000 times through the file.
Would it break anything for *that* to be zero in dummy blocks?
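
To make the effect concrete, here is a toy model, not PostgreSQL's actual
page layout. It assumes 16 MB segments and 8 kB pages (which gives the
roughly 2000 pages mentioned above), uses a made-up 8-byte address field
as the whole page header, and counts differing pages as a crude stand-in
for what a block-matching delta tool such as rsync has to transmit.

```python
SEGMENT_SIZE = 16 * 1024 * 1024   # assumed WAL segment size
PAGE_SIZE = 8 * 1024              # assumed WAL page size


def make_segment(segno, zero_address):
    """Build a synthetic, otherwise empty segment.

    Each page starts with a hypothetical 8-byte address field; the rest
    of the page is zeros. With zero_address=True the field is zero too.
    """
    pages = []
    for pageno in range(SEGMENT_SIZE // PAGE_SIZE):
        addr = 0 if zero_address else segno * SEGMENT_SIZE + pageno * PAGE_SIZE
        header = addr.to_bytes(8, "little")
        pages.append(header + bytes(PAGE_SIZE - len(header)))
    return b"".join(pages)


def differing_pages(a, b):
    """Count pages whose bytes differ between two equal-sized segments."""
    return sum(
        a[off:off + PAGE_SIZE] != b[off:off + PAGE_SIZE]
        for off in range(0, len(a), PAGE_SIZE)
    )


# Two consecutive "empty" segments, with and without the address field.
incremented = differing_pages(make_segment(0x42, False), make_segment(0x43, False))
zeroed = differing_pages(make_segment(0x42, True), make_segment(0x43, True))

print("pages differing, address incremented:", incremented)
print("pages differing, address zeroed:     ", zeroed)
```

In this model every page differs when the field is incremented and none
differ when it is zeroed, which is the whole difference between a delta
that has to touch the entire file and one that is nearly empty.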

-Chap
