Re: Teach pg_receivewal to use lz4 compression

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: gkokolatos(at)pm(dot)me
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Teach pg_receivewal to use lz4 compression
Date: 2021-06-30 03:38:03
Message-ID: YNvnG9okUi1B5MIu@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 29, 2021 at 02:45:17PM +0000, gkokolatos(at)pm(dot)me wrote:
> The program pg_receivewal can use gzip compression to store the received WAL.
> This patch teaches it to be able to use lz4 compression if the binary is build
> using the -llz4 flag.

Nice.

> Previously, the user had to use the option --compress with a value between [0-9]
> to denote that gzip compression was requested. This specific behaviour is
> maintained. A newly introduced option --compress-program=lz4 can be used to ask
> for the logs to be compressed using lz4 instead. In that case, no compression
> values can be selected as it does not seem too useful.

Yes, I am not convinced either that we should care about making the
acceleration customizable.

> Under the hood there is nothing exceptional to be noted. Tar based archives have
> not yet been taught to use lz4 compression. Those are used by pg_basebackup. If
> is is felt useful, then it is easy to be added in a new patch.

Documentation is missing from the patch.

+ LZ4F_compressionContext_t ctx;
+ size_t outbufCapacity;
+ void *outbuf;
It may be cleaner to refer to lz4 in the name of those variables?

+ ctx_out = LZ4F_createCompressionContext(&ctx, LZ4F_VERSION);
+ outbufCapacity = LZ4F_compressBound(LZ4_IN_SIZE, NULL /* default preferences */);
Interesting. So this cannot be done at compilation time because of
the auto-flush mode looking at the LZ4 code. That looks about right.

getopt_long() is forgotting the new option 'I'.

+ system_or_bail('lz4', '-t', $lz4_wals[0]);
I think that you should just drop this part of the test. The only
part of LZ4 that we require to be present when Postgres is built with
--with-lz4 is its library liblz4. Commands associated to it may not
be around, causing this test to fail. The test checking that one .lz4
file has been created is good to have. It may be worth adding a test
with a .lz4.partial segment generated and --endpos pointing to a LSN
that does not finish the segment that gets switched.

It seems to me that you are missing some logic in
FindStreamingStart() to handle LZ4-compressed segments, in relation
with IsCompressXLogFileName() and IsPartialCompressXLogFileName().

+ pg_log_error("invalid compress-program \"%s\"", optarg);
"compress-program" sounds weird. Shouldn't that just say "invalid
compression method" or similar?

+ printf(_(" -Z, --compress=0-9 compress logs with given
compression level (available only with compress-program=zlib)\n"));
This line is too long.

Should we have more tests for ZLIB, while on it? That seems like a
good addition as long as we can skip the tests conditionally when
that's not supported.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-06-30 03:59:30 Re: Allow streaming the changes after speculative aborts.
Previous Message Noah Misch 2021-06-30 03:07:27 Re: What is "wraparound failure", really?