Re: Add LZ4 compression in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: gkokolatos(at)pm(dot)me, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-16 01:56:25
Message-ID: 20230116015625.GH9837@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
> On Sat, Jan 14, 2023 at 03:43:09PM -0600, Justin Pryzby wrote:
> > On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
> >> pg_compress_specification is being passed by value, but I think it
> >> should be passed as a pointer, as is done everywhere else.
> >
> > ISTM that was an issue with 5e73a6048, affecting a few public and
> > private functions. I wrote a pre-preparatory patch which changes to
> > pass by reference.
>
> The functions changed by 0001 are cfopen[_write](),
> AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
> to change these interfaces which basically exist to handle inputs?

I changed to pass pg_compress_specification as a pointer, since that's
the usual convention for structs, as followed by the existing uses of
pg_compress_specification.

> Is there some benefit in changing compression_spec within the
> internals of these routines before going back one layer down to their
> callers? Changing the compression_spec on-the-fly in these internal
> paths could be risky, actually, no?

I think what you're saying is that if the spec is passed as a pointer,
then the called functions shouldn't set spec->algorithm=something.

I agree that if they need to do that, they should use a local variable.
Which looks to be true for the functions that were changed in 001.

> > And addressed a handful of other issues I reported as separate fixup
> > commits. And changed to use LZ4 by default for CI.
>
> Are your slight changes shaped as of 0003-f.patch, 0005-f.patch and
> 0007-f.patch on top of the original patches sent by Georgios?

Yes, the original patches, rebased as needed on top of HEAD and 001...

> >> pg_compress_algorithm is being writen directly into the pg_dump header.
>
> Do you mean that this is what happens once the patch series 0001~0008
> sent upthread is applied on HEAD?

Yes

> - /*
> - * For now the compression type is implied by the level. This will need
> - * to change once support for more compression algorithms is added,
> - * requiring a format bump.
> - */
> - WriteInt(AH, AH->compression_spec.level);
> + AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
>
> I may be missing something here, but it seems to me that you ought to
> store as well the level in the dump header, or it would not be
> possible to report in the dump's description what was used? Hence,
> K_VERS_1_15 should imply that we have both the method compression and
> the compression level.

Maybe. But the "level" isn't needed for decompression for any case I'm
aware of.

Also, dumps with the default compression level currently say:
"Compression: -1", which does't seem valuable.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message gkokolatos 2023-01-16 02:27:56 Re: Add LZ4 compression in pg_dump
Previous Message Michael Paquier 2023-01-16 01:53:40 Re: constify arguments of copy_file() and copydir()