Re: Add LZ4 compression in pg_dump

From: gkokolatos(at)pm(dot)me
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-18 19:05:43
Message-ID: ljb5G0qPPj_MM5y-C5fRT24uuuj6n5I5RY__MpT7JdQ_B_AIE8vIOgdRO1uiBee6cymDzFLAKbYh3wDFv86jprLhVKOdx70D4g_4JTjqnY0=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

------- Original Message -------
On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:

>
>
> Hi,
>
> On 1/16/23 16:14, gkokolatos(at)pm(dot)me wrote:
>
> > Hi,
> >
> > I admit I am completely at lost as to what is expected from me anymore.
>
<snip>
>
> Unfortunately, this plays against this patch - I'm certainly in favor of
> adding lz4 (and other compression algos) into pg_dump, but if I commit
> 0001 we get little benefit, and the other parts actually adding lz4/zstd
> are treated as "WIP / for completeness" so it's unclear when we'd get to
> commit them.

Thank you for your kindness and for taking the time to explain.

> So if I could recommend one thing, it'd be to get at least one of those
> WIP patches into a shape that's likely committable right after 0001.

This was clearly my fault. I misunderstood a suggestion upthread to focus
on the first patch of the series and ignore documentation and comments on
the rest.

Please find v21 to contain 0002 and 0003 in a state which I no longer consider
as WIP but worthy of proper consideration. Some guidance on where is best to add
documentation in 0002 for the function pointers in CompressFileHandle will
be welcomed.

>
> > I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
> > Please find a rebased v20 attached.
>
>
> I took a quick look at 0001, so a couple comments (sorry if some of this
> was already discussed in the thread):

Much appreciated!

>
> 1) I don't think a "refactoring" patch should reference particular
> compression algorithms (lz4/zstd), and in particular I don't think we
> should have "not yet implemented" messages. We only have a couple other
> places doing that, when we didn't have a better choice. But here we can
> simply reject the algorithm when parsing the options, we don't need to
> do that in a dozen other places.

I have now removed lz4/zstd from where they were present with the exception
of pg_dump.c which is responsible for parsing.

> 2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
> "none" at the end. It might make backpatches harder.

Agreed. However a 'default' is needed in order to avoid compilation warnings.
Also note that 0002 completely does away with cases within WriteDataToArchive.

> 3) While building, I get bunch of warnings about missing cfdopen()
> prototype and pg_backup_archiver.c not knowing about cfdopen() and
> adding an implicit prototype (so I doubt it actually works).

Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got removed
in 69fb29d1af. v20 failed to properly take 69fb29d1af in consideration. Note
that cfdopen is removed in 0002 which explains why cfbot didn't complain.

> 4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
> FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
> better to have a "union" of correct types?

Please find and updated comment and a union in place of the void *. Also
note that 0002 completely does away with cfp in favour of a new struct
CompressFileHandle. I maintained the void * there because it is used by
private methods of the compressors. 0003 contains such an example with
LZ4CompressorState.

> 5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
> comment, but that's a static function while cfopen/cfdopen are the
> actual API.

Added comments to cfopen/cfdopen.

>
> > Also please let me know if I should silently step away from it and let other people lead
> > it. I would be glad to comply either way.
>
>
> Please don't. I promise to take a look at this patch again.

Thank you very much.

> Thanks for doing all the work.

Thank you.

Cheers,
//Georgios

> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Attachment Content-Type Size
v21-0002-Introduce-Compressor-API-in-pg_dump.patch text/x-patch 66.2 KB
v21-0001-Prepare-pg_dump-internals-for-additional-compres.patch text/x-patch 22.1 KB
v21-0003-Add-LZ4-compression-to-pg_dump.patch text/x-patch 29.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Karl O. Pinc 2023-01-18 19:06:05 Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences
Previous Message Andrei Zubkov 2023-01-18 19:04:56 Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements