Re: Add LZ4 compression in pg_dump

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: gkokolatos(at)pm(dot)me, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-26 05:49:27
Message-ID: Y9IUZzQsusY3VuyJ@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 25, 2023 at 12:00:20PM -0600, Justin Pryzby wrote:
> While looking at this, I realized that commit 5e73a6048 introduced a
> regression:
>
> @@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
>
> - if (AH->compression != 0)
> - pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
> + if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> + pg_fatal("archive is compressed, but this installation does not support compression");
>
> Before, it was possible to restore non-data chunks of a dump file, even
> if the current build didn't support its compression. But that's now
> impossible - and it makes the code we're discussing in RestoreArchive()
> unreachable.

Right. The impacts the possibility of looking at the header data,
which is useful with pg_restore -l for example. On a dump that's been
compressed, pg_restore <= 15 would always print the TOC entries with
or without compression support. On HEAD, this code prevents the
header lookup. All *nix or BSD platforms should have support for
zlib, I hope.. Still that could be an issue on Windows, and this
would prevent folks to check the contents of the dumps after saving it
on a WIN32 host, so let's undo that.

So, I have been testing the attached with four sets of binaries from
15/HEAD and with[out] zlib support, and this brings HEAD back to the
pre-15 state (header information able to show up, still failure when
attempting to restore the dump's data without zlib).

> I don't think we can currently test for that, since it requires creating a dump
> using a build --with compression and then trying to restore using a build
> --without compression.

Right, the location of the data is in the header, and I don't see how
you would be able to do that without two sets of binaries at hand, but
our tests run under the assumption that you have only one. Well,
that's not entirely true as well, as you could create a TAP test like
pg_upgrade that relies on a environment variable pointing to a second
set of binaries. That's not worth the complication involved, IMO.

> The coverage report disagrees with me, though...
> https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901

Isn't that one of the tests like compression_gzip_plain?

Thoughts?
--
Michael

Attachment Content-Type Size
dump-header-compress.patch text/x-diff 601 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-01-26 06:28:46 Re: Add LZ4 compression in pg_dump
Previous Message Bharath Rupireddy 2023-01-26 05:33:28 Re: Improve WALRead() to suck data directly from WAL buffers when possible