Re: Add LZ4 compression in pg_dump

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: gkokolatos(at)pm(dot)me
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rachel Heaton <rachelmheaton(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add LZ4 compression in pg_dump
Date: 2022-03-30 05:54:34
Message-ID: YkPwmndklVKG4PE+@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 29, 2022 at 09:46:27AM +0000, gkokolatos(at)pm(dot)me wrote:
> On Tuesday, March 29th, 2022 at 9:27 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> On Sat, Mar 26, 2022 at 01:14:41AM -0500, Justin Pryzby wrote:
>> Wow. This stuff is old enough to vote (c3e18804), dead since its
>> introduction. There is indeed an argument for removing that, it is
>> not good to keep around that that has never been stressed and/or
>> used. Upon review, the cleanup done looks correct, as we have never
>> been able to generate .dat.gz files in for a dump in the tar format.
>
> Correct. My driving force behind it was to ease up the cleanup/refactoring
> work that follows, by eliminating the callers of the GZ*() macros.

Makes sense to me.

>> + command_fails_like(
>>
>> + [ 'pg_dump', '--compress', '1', '--format', 'tar' ],
>> This addition depending on HAVE_LIBZ is a good thing as a reminder of
>> any work that could be done in 0002. Now that's waiting for 20 years
>> so I would not hold my breath on this support. I think that this
>> could be just applied first, with 0002 on top of it, as a first
>> improvement.
>
> Excellent, thank you.

I have applied the test for --compress and --format=tar, separating it
from the rest.

While moving on with 0002, I have noticed the following in
_StartBlob():
if (AH->compression != 0)
sfx = ".gz";
else
sfx = "";

Shouldn't this bit also be simplified, adding a fatal() like the other
code paths, for safety?

>> + compress_cmd => [
>> + $ENV{'GZIP_PROGRAM'},
>> + '-f',
>> [...]
>> + $ENV{'GZIP_PROGRAM'},
>> + '-k', '-d',
>> -f and -d are available everywhere I looked at, but is -k/--keep a
>> portable choice with a gzip command? I don't see this option in
>> OpenBSD, for one. So this test is going to cause problems on those
>> buildfarm machines, at least. Couldn't this part be replaced by a
>> simple --test to check that what has been compressed is in correct
>> shape? We know that this works, based on our recent experiences with
>> the other tests.
>
> I would argue that the simple '--test' will not do in this case, as the
> TAP tests do need a file named <test>.sql to compare the contents with.
> This file is generated either directly by pg_dump itself, or by running
> pg_restore on pg_dump's output. In the case of compression pg_dump will
> generate a <test>.sql.<compression program suffix> which can not be
> used in the comparison tests. So the intention of this block, is not to
> simply test for validity, but to also decompress pg_dump's output for it
> to be able to be used.

Ahh, I see, thanks. I would add a comment about that in the area of
compression_gzip_plain_format.

+ my $supports_compression = check_pg_config("#define HAVE_LIBZ 1");

This part could be moved within the if block a couple of lines down.

+ my $compress_program = $ENV{GZIP_PROGRAM};

It seems to me that it is enough to rely on {compress_cmd}, hence
there should be no need for $compress_program, no?

It seems to me that we should have a description for compress_cmd at
the top of 002_pg_dump.pl (close to "Definition of the pg_dump runs to
make"). There is an order dependency with restore_cmd.

> I updated the patch to simply remove the '-k' flag.

Okay.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-03-30 05:55:51 Re: Extend compatibility of PostgreSQL::Test::Cluster
Previous Message Amit Kapila 2022-03-30 05:52:30 Re: Identify missing publications from publisher while create/alter subscription.