Re: Add LZ4 compression in pg_dump

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: gkokolatos(at)pm(dot)me
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2022-12-02 01:56:26
Message-ID: Y4lbSoDPlVdbes1u@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 01, 2022 at 02:58:35PM +0000, gkokolatos(at)pm(dot)me wrote:
> Nuking the warning from orbit and changing the behaviour around disabling
> the requested compression when the libraries are not present, should not
> mean that we need to change the behaviour of default values for different
> formats. Please find v13 attached which reinstates it.

Gah, thanks! And this default behavior is documented as dependent on
the compilation as well.

> Which in itself it got me looking and wondering why the tests succeeded.
> The only existing test covering that path is `defaults_dir_format` in
> `002_pg_dump.pl`. However as the test is currently written it does not
> check whether the output was compressed. The restore command would succeed
> in either case. A simple `gzip -t -r` against the directory will not
> suffice to test it, because there exist files which are never compressed
> in this format (.toc). A little bit more involved test case would need
> to be written, yet before I embark to this journey, I would like to know
> if you would agree to reinstate the defaults for those formats.

On top of my mind, I briefly recall that -r is not that portable. And
the toc format makes the files generated non-deterministic as these
use OIDs..

[.. thinks ..]

We are going to need a new thing here, as compress_cmd cannot be
directly used. What if we used only an array of glob()-able elements?
Let's say "expected_contents" that could include a "dir_path/*.gz"
conditional on $supports_gzip? glob() can only be calculated when the
test is run as the file names cannot be known beforehand :/

>> As per the patch, it is true that we do not need to bump the format of
>> the dump archives, as we can still store only the compression level
>> and guess the method from it. I have added some notes about that in
>> ReadHead and WriteHead to not forget.
>
> Agreed. A minor suggestion if you may.
>
> #ifndef HAVE_LIBZ
> - if (AH->compression != 0)
> + if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
> #endif
>
> It would seem a more consistent to error out in this case. We do error
> in all other cases where the compression is not available.

Makes sense.

I have gone through the patch again, and applied it. Thanks!
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-12-02 02:10:17 Re: Using AF_UNIX sockets always for tests on Windows
Previous Message Tom Lane 2022-12-02 01:56:18 Re: Using AF_UNIX sockets always for tests on Windows