Re: Add LZ4 compression in pg_dump

From: gkokolatos(at)pm(dot)me
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rachel Heaton <rachelmheaton(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add LZ4 compression in pg_dump
Date: 2022-03-29 09:46:27
Message-ID: yTrMijGhpyxYVVeJuJPM5NandUq_5Qq647cqNuUpVkXSfbYPDj0VLfRK72jB2HWoeDgRIYIbjG7YbBrUuzv8fRF9Y5vALIrM9BGIgQLh8vM=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

------- Original Message -------

On Tuesday, March 29th, 2022 at 9:27 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Sat, Mar 26, 2022 at 01:14:41AM -0500, Justin Pryzby wrote:
> > See 0001 and the manpage.
> > + 'pg_dump: compression is not supported by tar archive format');
> > When I submitted a patch to support zstd, I spent awhile trying to make
> > compression work with tar, but it's a significant effort and better done
> > separately.
>
> Wow. This stuff is old enough to vote (c3e18804), dead since its
> introduction. There is indeed an argument for removing that, it is
> not good to keep around that that has never been stressed and/or
> used. Upon review, the cleanup done looks correct, as we have never
> been able to generate .dat.gz files in for a dump in the tar format.

Correct. My driving force behind it was to ease up the cleanup/refactoring
work that follows, by eliminating the callers of the GZ*() macros.

> + command_fails_like(
>
> + [ 'pg_dump', '--compress', '1', '--format', 'tar' ],
> This addition depending on HAVE_LIBZ is a good thing as a reminder of
> any work that could be done in 0002. Now that's waiting for 20 years
> so I would not hold my breath on this support. I think that this
> could be just applied first, with 0002 on top of it, as a first
> improvement.

Excellent, thank you.

> + compress_cmd => [
> + $ENV{'GZIP_PROGRAM'},
> Patch 0001 is missing and update of pg_dump's Makefile to pass down
> this environment variable to the test scripts, no?

Agreed. It was not properly moved forward. Fixed.

> + compress_cmd => [
> + $ENV{'GZIP_PROGRAM'},
> + '-f',
> [...]
> + $ENV{'GZIP_PROGRAM'},
> + '-k', '-d',
> -f and -d are available everywhere I looked at, but is -k/--keep a
> portable choice with a gzip command? I don't see this option in
> OpenBSD, for one. So this test is going to cause problems on those
> buildfarm machines, at least. Couldn't this part be replaced by a
> simple --test to check that what has been compressed is in correct
> shape? We know that this works, based on our recent experiences with
> the other tests.

I would argue that the simple '--test' will not do in this case, as the
TAP tests do need a file named <test>.sql to compare the contents with.
This file is generated either directly by pg_dump itself, or by running
pg_restore on pg_dump's output. In the case of compression pg_dump will
generate a <test>.sql.<compression program suffix> which can not be
used in the comparison tests. So the intention of this block, is not to
simply test for validity, but to also decompress pg_dump's output for it
to be able to be used.

I updated the patch to simply remove the '-k' flag.

Please find v3 attached. (only 0001 and 0002 are relevant, 0003 and
0004 are only for reference and are currently under active modification).

Cheers,
//Georgios

Attachment Content-Type Size
v3-0004-Add-LZ4-compression-in-pg_-dump-restore.patch text/x-patch 46.7 KB
v3-0003-Prepare-pg_dump-for-additional-compression-method.patch text/x-patch 52.9 KB
v3-0001-Extend-compression-coverage-for-pg_dump-pg_restor.patch text/x-patch 5.9 KB
v3-0002-Remove-unsupported-bitrot-compression-from-pg_dum.patch text/x-patch 4.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-03-29 10:00:22 Re: Column Filtering in Logical Replication
Previous Message Nikita Malakhov 2022-03-29 09:41:47 Generic JSON API