Re: Teach pg_receivewal to use lz4 compression

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: gkokolatos(at)pm(dot)me
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Teach pg_receivewal to use lz4 compression
Date: 2021-11-04 07:31:48
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 03, 2021 at 09:11:24AM +0000, gkokolatos(at)pm(dot)me wrote:
> Please find v9 attached.

Thanks. I have looked at 0001 today, and applied it after fixing a
couple of issues. From memory:
- zlib.h was missing from pg_receivewal.c, issue that I noticed after
removing the redefinition of Z_DEFAULT_COMPRESSION because there was
no need for it (did a run with a --without-zlib as well).
- Simplified a bit the error handling for incorrect option
combinations, using a switch/case while on it.
- Renamed the existing variable "compression" in walmethods.c to
compression_level, to reduce any confusion with the introduction of
compression_method. One thing I have noticed is about the tar method,
where we rely on the compression level to decide if compression should
be used or not. There should be some simplifications possible there
but there is a huge take in receivelog.c where we use COMPRESSION_NONE
to track down that we still want to zero a new segment when using tar
- Use of 'I' as short option name, err... After applying the first

Based on the work of 0001, there were some conflicts with 0002. I
have solved them while reviewing it, and adapted the code to what got
already applied.

+ header_size = LZ4F_compressBegin(ctx, lz4buf, lz4bufsize, NULL);
+ if (LZ4F_isError(header_size))
+ {
+ pg_free(lz4buf);
+ close(fd);
+ return NULL;
+ }
In dir_open_for_write(), I guess that this one is missing one

+ status = LZ4F_freeDecompressionContext(ctx);
+ if (LZ4F_isError(status))
+ {
+ pg_log_error("could not free LZ4 decompression context: %s",
+ LZ4F_getErrorName(status));
+ exit(1);
+ }
+ if (uncompressed_size != WalSegSz)
+ {
+ pg_log_warning("compressed segment file \"%s\" has
incorrect uncompressed size %ld, skipping",
+ dirent->d_name, uncompressed_size);
+ (void) LZ4F_freeDecompressionContext(ctx);
+ continue;
+ }
When the uncompressed size does not match out expected size, the
second LZ4F_freeDecompressionContext() looks unnecessary to me because
we have already one a couple of lines above.

+ ctx_out = LZ4F_createCompressionContext(&ctx, LZ4F_VERSION);
+ lz4bufsize = LZ4F_compressBound(LZ4_IN_SIZE, NULL);
+ if (LZ4F_isError(ctx_out))
+ {
+ close(fd);
+ return NULL;
+ }
LZ4F_compressBound() can be after the check on ctx_out, here.

+ while (1)
+ {
+ char *readp;
+ char *readend;
Simply looping when decompressing a segment to check its size looks
rather unsafe to me. We should leave the loop once uncompressed_size
is strictly more than WalSegSz.

The amount of TAP tests looks fine, and that's consistent with what we
do for zlib^D^D^D^Dgzip. Now, when testing manually pg_receivewal
with various combinations of gzip, lz4 and none, I can see the
following failure in the code that calculates the streaming start
pg_receivewal: error: could not decompress file
"wals//000000010000000000000006.lz4": ERROR_frameType_unknown

In the LZ4 code, this points to lib/lz4frame.c, where we read an
incorrect header (see the part that does not match LZ4F_MAGICNUMBER).
The segments written by pg_receivewal are clean. Please note that
this shows up as well when manually compressing some segments with a
simple lz4 command, to simulate for example the case where a user
compressed some segments by himself/herself before running

So, tour code does LZ4F_createDecompressionContext() followed by a
loop on read() and LZ4F_decompress() that relies on an input and an
output buffer of a fixed 4kB size (we could use 64kB at least here
actually?). So this set of loops looks rather correct to me.

Now, this part is weird:
+ while (readp < readend)
+ {
+ size_t read_size = 1;
+ size_t out_size = 1;

I would have expected read_size to be (readend - readp) to match with
the remaining data in the read buffer that we still need to read.
Shouldn't out_size also be LZ4_CHUNK_SZ to match with the size of the
output buffer where all the contents are read? By setting it to 1, I
think that this is doing more loops into LZ4F_decompress() than really
necessary. It would not hurt either to memset(0) those buffers before
they are used, IMO. I am not completely sure either, but should we
use the number of bytes returned by LZ4F_decompress() as a hint for
the follow-up loops?

Attached is an updated patch, which includes fixes for most of the
issues I am mentioning above. Please note that I have not changed
FindStreamingStart(), so this part is the same as v9.


Attachment Content-Type Size
v10-0001-Teach-pg_receivewal-to-use-LZ4-compression.patch text/x-diff 20.3 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-11-04 07:38:00 Re: Missing include <openssl/x509.h> in be-secure-openssl.c?
Previous Message Shinya Kato 2021-11-04 07:00:06 Re: CREATEROLE and role ownership hierarchies