Re: Add LZ4 compression in pg_dump

From: gkokolatos(at)pm(dot)me
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2022-12-05 12:48:28
Message-ID: DQn4czCWR1rcbGPLL7p3LfEr5-kGmlySm-H05VgroINdikvhtS5r9EdI6b8D8sjnbKdJ09k-cxs2AqijBeHAWk9Q8gvEAxPRHuLRhwONcGc=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

------- Original Message -------
On Monday, December 5th, 2022 at 8:05 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

>
>
> On Sat, Dec 03, 2022 at 11:45:30AM +0900, Michael Paquier wrote:
>
> > While this is correct in checking that the contents are compressed
> > under --with-zlib, this also removes the coverage where we make sure
> > that this command is able to complete under --without-zlib without
> > compressing any of the table data files. Hence my point from
> > upthread: this test had better not use compile_option, but change
> > glob_pattern depending on if the build uses zlib or not.
>
> In short, I mean something like the attached. I have named the flag
> content_patterns, and switched it to an array so as we can check that
> toc.dat is always uncompression and that the other data files are
> always uncompressed.

I see. This approach is much better than my proposal, thanks. If you
allow me, I find 'content_patterns' to be slightly ambiguous. While is
true that it refers to the contents of a directory, it is not the
contents of the dump that it is examining. I took the liberty of proposing
an alternative name in the attached v16.

I also took the liberty of applying the test pattern when it the dump
is explicitly compressed.

> > In order to check this behavior with defaults_custom_format, perhaps
> > we could just remove the -Z6 from it or add an extra command for its
> > default behavior?
>
> This is slightly more complicated as there is just one file generated
> for the compression and non-compression cases, so I have let that as
> it is now.

I was thinking a bit more about this. I think that we can use the list
TOC option of pg_restore. This option will first print out the header
info which contains the compression. Perl utils already support to
parse the generated output of a command. Please find an attempt to do
so in the attached. The benefits of having some testing for this case
become a bit more obvious in 0004 of the patchset, when lz4 is
introduced.

Cheers,
//Georgios

> --
> Michael

Attachment Content-Type Size
v16-0001-Provide-coverage-for-pg_dump-default-compression.patch text/x-patch 5.3 KB
v16-0002-Prepare-pg_dump-internals-for-additional-compres.patch text/x-patch 19.9 KB
v16-0003-Introduce-Compressor-API-in-pg_dump.patch text/x-patch 56.8 KB
v16-0004-Add-LZ4-compression-in-pg_-dump-restore.patch text/x-patch 28.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melih Mutlu 2022-12-05 13:00:12 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Previous Message Alvaro Herrera 2022-12-05 12:05:49 Re: move some bitmapset.c macros to bitmapset.h