Re: pg_basebackup's --gzip switch misbehaves

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_basebackup's --gzip switch misbehaves
Date: 2022-09-04 05:20:52
Message-ID: YxQ1tIYF8hbrNO7q@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 03, 2022 at 11:11:29AM -0400, Tom Lane wrote:
> It makes sense that base.tar.gz is compressed a little better with
> --gzip than with level-1 compression, but why is pg_wal.tar.gz not
> compressed at all? It looks like the problem probably boils down to
> which of "-1" and "0" means "default behavior" vs "no compression",
> with different code layers interpreting that differently. I can't
> find exactly where that's happening, but I did manage to stop the
> failures with this crude hack:

There is a distinction coming in pg_basebackup.c from the way we
deparse the compression specification and the default compression
level that should be assigned if there is no level directly specified
by the user. It seems to me that the error comes from this code in
BaseBackup() when we are under STREAM_WAL (default):

if (client_compress->algorithm == PG_COMPRESSION_GZIP)
{
wal_compress_algorithm = PG_COMPRESSION_GZIP;
wal_compress_level =
(client_compress->options & PG_COMPRESSION_OPTION_LEVEL)
!= 0 ? client_compress->level : 0;

ffd5365 has missed that wal_compress_level should be set to
Z_DEFAULT_COMPRESSION if there is nothing set in the compression
spec for a zlib build. pg_receivewal.c enforces that already.

> That's not right as a real fix, because it would have the effect
> that "--compress gzip:0" would also invoke default compression,
> whereas what it should do is produce the uncompressed output
> we're actually getting. Both cases have compression_level == 0
> by the time we reach here, though.

Nope, that would not be right.

> BTW, I'm fairly astonished that anyone would have thought that three
> complete pg_basebackup cycles testing essentially-identical options
> were a good use of developer time and buildfarm cycles from here to
> eternity. Even if digging into it did expose a bug, the test case
> deserves little credit for that, because it entirely failed to call
> attention to the problem. I had to whack the script pretty hard
> just to get it to not delete the evidence.

The introduction of the compression specification has introduced a lot
of patterns where we expect or not expect compression to happen, and
on top of that this needs to be careful about backward-compatibility.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-09-04 05:40:22 Re: freeing LDAPMessage in CheckLDAPAuth
Previous Message John Naylor 2022-09-04 05:16:10 Re: build remaining Flex files standalone