Re: Add LZ4 compression in pg_dump

From: gkokolatos(at)pm(dot)me
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-01-16 02:27:56
Message-ID: We5rC7EDgemlJsuSvwZ7vG9-1pPDmML3u7EbbXcZGTItFEg5UoaNfATLNBVDJS2D4PShvyAoUBUvLlNnNDOtkmZZ09knXJ5aV16neuIw2UY=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Oh, I didn’t realize you took over Justin? Why? After almost a year of work?

This is rather disheartening.

On Mon, Jan 16, 2023 at 02:56, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
>> On Sat, Jan 14, 2023 at 03:43:09PM -0600, Justin Pryzby wrote:
>> > On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
>> >> pg_compress_specification is being passed by value, but I think it
>> >> should be passed as a pointer, as is done everywhere else.
>> >
>> > ISTM that was an issue with 5e73a6048, affecting a few public and
>> > private functions. I wrote a pre-preparatory patch which changes to
>> > pass by reference.
>>
>> The functions changed by 0001 are cfopen[_write](),
>> AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
>> to change these interfaces which basically exist to handle inputs?
>
> I changed to pass pg_compress_specification as a pointer, since that's
> the usual convention for structs, as followed by the existing uses of
> pg_compress_specification.
>
>> Is there some benefit in changing compression_spec within the
>> internals of these routines before going back one layer down to their
>> callers? Changing the compression_spec on-the-fly in these internal
>> paths could be risky, actually, no?
>
> I think what you're saying is that if the spec is passed as a pointer,
> then the called functions shouldn't set spec->algorithm=something.
>
> I agree that if they need to do that, they should use a local variable.
> Which looks to be true for the functions that were changed in 001.
>
>> > And addressed a handful of other issues I reported as separate fixup
>> > commits. And changed to use LZ4 by default for CI.
>>
>> Are your slight changes shaped as of 0003-f.patch, 0005-f.patch and
>> 0007-f.patch on top of the original patches sent by Georgios?
>
> Yes, the original patches, rebased as needed on top of HEAD and 001...
>
>> >> pg_compress_algorithm is being writen directly into the pg_dump header.
>>
>> Do you mean that this is what happens once the patch series 0001~0008
>> sent upthread is applied on HEAD?
>
> Yes
>
>> - /*
>> - * For now the compression type is implied by the level. This will need
>> - * to change once support for more compression algorithms is added,
>> - * requiring a format bump.
>> - */
>> - WriteInt(AH, AH->compression_spec.level);
>> + AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
>>
>> I may be missing something here, but it seems to me that you ought to
>> store as well the level in the dump header, or it would not be
>> possible to report in the dump's description what was used? Hence,
>> K_VERS_1_15 should imply that we have both the method compression and
>> the compression level.
>
> Maybe. But the "level" isn't needed for decompression for any case I'm
> aware of.
>
> Also, dumps with the default compression level currently say:
> "Compression: -1", which does't seem valuable.
>
> --
> Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-01-16 02:28:13 Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl
Previous Message Justin Pryzby 2023-01-16 01:56:25 Re: Add LZ4 compression in pg_dump