Re: Add LZ4 compression in pg_dump

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: gkokolatos(at)pm(dot)me
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-03-31 23:34:16
Message-ID: cee5cb44-3690-0049-12e0-61b433615504@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/31/23 11:19, gkokolatos(at)pm(dot)me wrote:
>
>> ...
>>
>>
>> I think the patch is fine, but I'm wondering if the renames shouldn't go
>> a bit further. It removes references to LZ4File struct, but there's a
>> bunch of functions with LZ4File_ prefix. Why not to simply use LZ4_
>> prefix? We don't have GzipFile either.
>>
>> Sure, it might be a bit confusing because lz4.h uses LZ4_ prefix, but
>> then we probably should not define LZ4_compressor_init ...
>
> This is a good point. The initial thought was that since lz4.h is now
> removed, such ambiguity will not be present. In v2 of the patch the
> function is renamed to `LZ4State_compression_init` since this name
> describes better its purpose. It initializes the LZ4State for
> compression.
>
> As for the LZ4File_ prefix, I have no objections. Please find the
> prefix changed to LZ4Stream_. For the record, the word 'File' is not
> unique to the lz4 implementation. The common data structure used by
> the API in compress_io.h:
>
> typedef struct CompressFileHandle CompressFileHandle;
>
> The public functions for this API are named:
>
> InitCompressFileHandle
> InitDiscoverCompressFileHandle
> EndCompressFileHandle
>
> And within InitCompressFileHandle the pattern is:
>
> if (compression_spec.algorithm == PG_COMPRESSION_NONE)
> InitCompressFileHandleNone(CFH, compression_spec);
> else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
> InitCompressFileHandleGzip(CFH, compression_spec);
> else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
> InitCompressFileHandleLZ4(CFH, compression_spec);
>
> It was felt that a prefix was required due to the inclusion 'lz4.h'
> header where naming functions as 'LZ4_' would be wrong. The prefix
> 'LZ4File_' seemed to be in line with the naming of the rest of
> the relevant functions and structures. Other compressions, gzip and
> none, did not face the same issue.
>
> To conclude, I think that having a prefix is slightly preferred
> over not having one. Since the prefix `LZ4File_` is not desired,
> I propose `LZ4Stream_` in v2.
>
> I will not object to dismissing the argument and drop `File` from
> the prefix, if so requested.
>

Thanks.

I think the LZ4Stream prefix is reasonable, so let's roll with that. I
cleaned up the patch a little bit (mostly comment tweaks, etc.), updated
the commit message and pushed it.

The main tweak I did is renaming all the LZ4State variables from "fs" to
state. The old name referred to the now abandoned "file state", but
after the rename to LZ4State that seems confusing. Some of the places
already used "state"and it's easier to know "state" is always LZ4State,
so let's keep it consistent.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2023-03-31 23:35:50 Re: RFC: logical publication via inheritance root?
Previous Message Justin Pryzby 2023-03-31 23:16:31 Re: zstd compression for pg_dump