Re: astreamer_lz4: fix bug of output pointer advancement in decompressor

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
Date: 2026-03-05 01:13:06
Message-ID: F79C0F6B-5211-4770-A4EB-3ADBEEFFE437@gmail.com
Lists: pgsql-hackers

> On Mar 5, 2026, at 01:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>> There have been a couple of LZ4-related patches recently, so I spent some time playing with the LZ4 path and found a bug in astreamer_lz4_decompressor_content().
>
> Yup, that's clearly wrong. I failed to reproduce a crash with the
> test hack you suggested, but no matter. Pushed with some cosmetic
> editorialization.

Hmm, I just tried again. After applying nocfbot_hack.diff to an old branch, I can easily reproduce the bug:
```
chaol(at)ChaodeMacBook-Air cndb % pg_basebackup -D /tmp/bkup_lz4 -F t -Z lz4 -X stream -c fast
2026-03-05 09:01:53.461 CST [72896] LOG: checkpoint starting: fast force wait time
2026-03-05 09:01:53.466 CST [72896] LOG: checkpoint complete: fast force wait time: wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.006 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16383 kB, estimate=29655 kB; lsn=0/14000080, redo lsn=0/14000028
chaol(at)ChaodeMacBook-Air cndb % pg_verifybackup -F t -n /tmp/bkup_lz4
pg_verifybackup: error: zsh: trace trap pg_verifybackup -F t -n /tmp/bkup_lz4
```

Then I switched to the latest master and applied nocfbot_hack.diff there as well:
```
chaol(at)ChaodeMacBook-Air postgresql % git diff
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index e196fcc81e5..35fd564df9a 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -331,10 +331,15 @@ astreamer_lz4_decompressor_content(astreamer *streamer,
         size_t      ret,
                     read_size,
                     out_size;
+        size_t      hack_out_size;

         read_size = avail_in;
         out_size = avail_out;

+        if (out_size > 5)
+            hack_out_size = out_size - 5;
+        else
+            hack_out_size = out_size;
         /*
          * This call decompresses the data starting at next_in and generates
          * the output data starting at next_out. It expects the caller to
@@ -349,13 +354,15 @@ astreamer_lz4_decompressor_content(astreamer *streamer,
          * to out_size respectively.
          */
         ret = LZ4F_decompress(mystreamer->dctx,
-                              next_out, &out_size,
+                              next_out, &hack_out_size,
                               next_in, &read_size, NULL);

         if (LZ4F_isError(ret))
             pg_log_error("could not decompress data: %s",
                          LZ4F_getErrorName(ret));

+        out_size = hack_out_size;
+
         /* Update input buffer based on number of bytes consumed */
         avail_in -= read_size;
         next_in += read_size;
```

With the fix on master, the bug no longer reproduces:
```
chaol(at)ChaodeMacBook-Air cndb % rm -rf /tmp/bkup_lz4
chaol(at)ChaodeMacBook-Air cndb % pg_basebackup -D /tmp/bkup_lz4 -F t -Z lz4 -X stream -c fast
2026-03-05 09:05:57.632 CST [72896] LOG: checkpoint starting: fast force wait
2026-03-05 09:05:57.634 CST [72896] LOG: checkpoint complete: fast force wait: wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 2 recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=0, longest=0.000 s, average=0.000 s; distance=32768 kB, estimate=32768 kB; lsn=0/16000080, redo lsn=0/16000028
chaol(at)ChaodeMacBook-Air cndb % pg_verifybackup -F t -n /tmp/bkup_lz4
backup successfully verified
```
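
For anyone wondering why the hack matters: it makes each decompress call produce fewer bytes than the remaining buffer capacity, so the loop has to continue filling a partially filled output buffer. Below is a minimal standalone sketch of the pointer bookkeeping that case relies on. It is not the PostgreSQL code; fake_decompress() and drain() are hypothetical names made up for illustration, and fake_decompress() only mimics the way LZ4F_decompress() reports consumed and produced byte counts through its size arguments.
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a streaming decompressor. It copies at most
 * "limit" bytes per call and reports, through the in/out size arguments,
 * how many bytes it consumed and how many it produced.
 */
static void
fake_decompress(char *dst, size_t *dst_size,
                const char *src, size_t *src_size, size_t limit)
{
    size_t      n = *src_size;

    if (n > *dst_size)
        n = *dst_size;
    if (n > limit)
        n = limit;
    memcpy(dst, src, n);
    *dst_size = n;              /* bytes actually produced */
    *src_size = n;              /* bytes actually consumed */
}

/* Move "len" bytes from src into out in short steps; returns bytes written. */
static size_t
drain(const char *src, size_t len, char *out, size_t out_cap)
{
    const char *next_in = src;
    char       *next_out = out;
    size_t      avail_in = len;
    size_t      avail_out = out_cap;

    while (avail_in > 0 && avail_out > 0)
    {
        size_t      read_size = avail_in;
        size_t      out_size = avail_out;

        /* Each call yields at most 5 bytes, so the buffer fills gradually. */
        fake_decompress(next_out, &out_size, next_in, &read_size, 5);

        /* Update input side by the number of bytes consumed */
        avail_in -= read_size;
        next_in += read_size;

        /* Advance output side by the bytes produced in this call only */
        avail_out -= out_size;
        next_out += out_size;
    }

    return (size_t) (next_out - out);
}
```
If next_out were advanced by anything other than the bytes produced in the current call, the second and later iterations would write to the wrong offset, which only shows up when a call leaves the buffer partially filled.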

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
