From: | Mikhail Gribkov <youzhick(at)gmail(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Decompression bug in astreamer_lz4 |
Date: | 2025-06-25 20:07:32 |
Message-ID: | CAMEv5_uQS1Hg6KCaEP2JkrTBbZ-nXQhxomWrhYQvbdzR-zy-wA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi hackers,
While developing my extension, I faced some problems in using
pg_basebackup/pg_verifybackup on a cluster with additional custom files
saved. The extension saves its own format files, which may have a variety
of sizes and usually quite low compression potential.
The same situation can be reproduced using randomly filled files. Let’s
imagine the extension saves its own statistics in pg_stat:
dd if=/dev/urandom of=$PGDATA/pg_stat/junk bs=1M count=16
Now we’ll run the server and make an lz4-packed backup:
$PGINSTALL/bin/pg_basebackup --no-sync -cfast --target
server:/tmp/server-backup -Xfetch --compress server-lz4
Now we’ll check the backup:
$PGINSTALL/bin/pg_verifybackup -n /tmp/server-backup
The verification process will fail. The exact error may vary, but the most
common version looks like this:
pg_verifybackup: error: checksum mismatch for file "pg_stat/junk" in
archive "base.tar.lz4"
pg_verifybackup: error: "base/4/3431" is present in the manifest but not on
disk
pg_verifybackup: error: "base/5/2620" is present in the manifest but not on
disk
pg_verifybackup: error: "base/1/2617" is present in the manifest but not on
disk
pg_verifybackup: error: "base/4/3576" is present in the manifest but not on
disk
......
Using gzip or zstd compressions in the same setup doesn’t lead to any
problems. Moreover, the lz4 archive itself is fine and it can be
successfully unpacked via cli lz4 unpacker. The problem thus is in reading,
and some investigation done by me and my colleagues lead us to
astreamer_lz4.c file.
Looks like astreamer_lz4_decompressor_content() function is initializing
_out-parameters incorrectly:
next_out = (uint8 *) mystreamer->base.bbs_buffer.data;
avail_out = mystreamer->base.bbs_buffer.maxlen;
, while should be:
next_out = (uint8 *) mystreamer->base.bbs_buffer.data +
mystreamer->bytes_written;
avail_out = mystreamer->base.bbs_buffer.maxlen - mystreamer->bytes_written;
And this would be consistent with what the gzip version of the function
(astreamer_gzip_decompressor_content()) does.
In the vast majority of cases the mystreamer->bytes_written value is zero
at the moment of next_out/avail_out initialization, thus the problem does
not appear. In fact we were totally unable to reproduce non-zero initial
value of mystreamer->bytes_written without poor-compressible custom format
files.I’m not that deep in the compression algorithms to say exactly why is
that, and I realize that very few users will really face the problem, but
the problem is definitely there.
Attached is a simple patch implementing the above fix. The issue disappears
with it.
What do you think?
--
best regards,
Mikhail A. Gribkov
Attachment | Content-Type | Size |
---|---|---|
v1-Fix-lz4-decompressor.patch | application/octet-stream | 651 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2025-06-25 21:44:22 | Re: Making Row Comparison NULL row member handling more robust during skip scans |
Previous Message | Jelte Fennema-Nio | 2025-06-25 19:59:23 | Re: Huge commitfest app update upcoming: Tags, Draft CF, Help page, and automated commitfest creat/open/close |