Re: Add LZ4 compression in pg_dump

From: gkokolatos(at)pm(dot)me
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-26 11:24:47
Message-ID: 6YZqN2HEE4yJxxxXWiMl4QWwKzLon0SdKETTGVkT9zjWnXMftGZjLKHHjGFuwC_5I6xL2s8nMXdLQGA4vEFveYXSHbfD9K9pDQvEJF2y99o=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

------- Original Message -------
On Thursday, January 26th, 2023 at 7:28 AM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

>
>
> On Thu, Jan 26, 2023 at 02:49:27PM +0900, Michael Paquier wrote:
>
> > On Wed, Jan 25, 2023 at 12:00:20PM -0600, Justin Pryzby wrote:
> >
> > > While looking at this, I realized that commit 5e73a6048 introduced a
> > > regression:
> > >
> > > @@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
> > >
> > > - if (AH->compression != 0)
> > > - pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
> > > + if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> > > + pg_fatal("archive is compressed, but this installation does not support compression");
> > >
> > > Before, it was possible to restore non-data chunks of a dump file, even
> > > if the current build didn't support its compression. But that's now
> > > impossible - and it makes the code we're discussing in RestoreArchive()
> > > unreachable.
> >
> > Right. The impacts the possibility of looking at the header data,
> > which is useful with pg_restore -l for example.
>
>
> It's not just header data - it's schema and (I think) everything other
> than table data.
>
> > > The coverage report disagrees with me, though...
> > > https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901
> >
> > Isn't that one of the tests like compression_gzip_plain?
>
>
> I'm not sure what you mean. Plain dump is restored with psql and not
> with pg_restore.
>
> My line number was wrong:
> https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#390
>
> What test would hit that code without rebuilding ?
>
> 394 : #ifndef HAVE_LIBZ
> 395 : if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
>
> > Thoughts?
> > #ifndef HAVE_LIBZ
> > if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
> > - pg_fatal("archive is compressed, but this installation does not support compression");
> > + pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
>
>
> Your patch is fine for now, but these errors should eventually specify
> which compression algorithm is unavailable. I think that should be a
> part of the 001 patch, ideally in a way that minimizes the number of
> places which need to be updated when adding an algorithm.

I gave this a little bit of thought. I think that ReadHead should not
emit a warning, or at least not this warning as it is slightly misleading.
It implies that it will automatically turn off data restoration, which is
false. Further ahead, the code will fail with a conflicting error message
stating that the compression is not available.

Instead, it would be cleaner both for the user and the maintainer to
move the check in RestoreArchive and make it the sole responsible for
this logic.

Please find v26 attached. 0001 does the above and 0002 addresses Justin's
complaints regarding the code footprint.

//Cheers,
Georgios

>
> --
> Justin

Attachment Content-Type Size
v26-0002-Prepare-pg_dump-internals-for-additional-compres.patch text/x-patch 13.7 KB
v26-0001-Address-regression-in-pg_dump-s-ReadHead.patch text/x-patch 2.1 KB
v26-0004-Add-LZ4-compression-to-pg_dump.patch text/x-patch 29.7 KB
v26-0003-Introduce-Compress-and-Compressor-API-in-pg_dump.patch text/x-patch 68.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-01-26 11:53:28 Re: Add LZ4 compression in pg_dump
Previous Message David Geier 2023-01-26 11:21:13 Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?