Re: AdvanceXLInsertBuffer vs. WAL segment compressibility

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Chapman Flack <chap(at)anastigmatix(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: AdvanceXLInsertBuffer vs. WAL segment compressibility
Date: 2016-07-26 02:09:58
Message-ID: CAB7nPqQ5NS5E-rWzd76MQHBEoc5cGHjSrUs4W6EiTxwWEx338Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 25, 2016 at 11:21 PM, Chapman Flack <chap(at)anastigmatix(dot)net> wrote:
> The impression that leaves is of tools that relied too heavily
> on internal format knowledge to be viable outside of core, which
> have had at least periods of incompatibility with newer PG versions,
> and whose current status, if indeed any are current, isn't easy
> to find out.

The WAL format went through a lot of changes in 9.4 as well. 9.3 also
introduced xlogreader.c, which is what *any* client trying to read WAL
into an understandable format should use.

> And that, I assume, was also the motivation to put the zeroing
> in AdvanceXLInsertBuffer, eliminating the need for one narrow,
> specialized tool like pg{_clear,_compress,less}log{,tail}, so
> the job can be done with ubiquitous, bog standard (and therefore
> *very* exhaustively tested) tools like gzip.

Exactly, and honestly, making such segments more compressible has been
a huge win.

> Even so, it still seems to me that a cheaper solution is a %e
> substitution in archive_command: just *tell* the command where
> the valid bytes end. Accomplishes the same thing as ~ 16 MB
> of otherwise-unnecessary I/O at the time of archiving each
> lightly-used segment.
>
> Then the actual zeroing could be suppressed to save I/O, maybe
> with a GUC variable, or maybe just when archive_command is seen
> to contain a %e. Commands that don't have a %e continue to work
> and compress effectively because of the zeroing.

This is over-complicating things for little gain. The new behavior of
zero-filling the tail of a segment makes things far better when using
gzip in archive_command.
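
As a rough illustration of the compressibility point, here is a minimal
Python sketch, not a measurement of real WAL. It assumes a 16 MB segment
holding 1 MB of valid data, and uses random bytes as a stand-in both for
the WAL records and for the stale contents a recycled segment might
otherwise carry in its tail. The sizes and the helper names
(build_segment, compressed_size) are hypothetical and chosen only for
this example.

```python
import gzip
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed segment size (16 MB)
VALID_BYTES = 1 * 1024 * 1024    # assumed amount of valid WAL in a lightly-used segment


def build_segment(valid_bytes: int, segment_size: int, zero_tail: bool) -> bytes:
    """Build a fake segment: random 'valid' data followed by a tail.

    Random bytes are only a stand-in; real WAL records and real stale
    data from a recycled segment would compress differently.
    """
    valid = os.urandom(valid_bytes)
    tail_len = segment_size - valid_bytes
    tail = bytes(tail_len) if zero_tail else os.urandom(tail_len)
    return valid + tail


def compressed_size(segment: bytes) -> int:
    """Return the size in bytes of the gzip-compressed segment."""
    return len(gzip.compress(segment, compresslevel=6))


if __name__ == "__main__":
    stale = compressed_size(build_segment(VALID_BYTES, SEGMENT_SIZE, zero_tail=False))
    zeroed = compressed_size(build_segment(VALID_BYTES, SEGMENT_SIZE, zero_tail=True))
    print(f"stale tail:  {stale} bytes after gzip")
    print(f"zeroed tail: {zeroed} bytes after gzip")
```

With a zeroed tail, the compressed output is dominated by the valid
portion, which is the effect a plain gzip in archive_command benefits
from without needing to know where the valid bytes end.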
--
Michael
