From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Dimitrios Apostolou <jimis(at)gmx(dot)net>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Date: | 2025-10-13 03:33:06 |
Message-ID: | CBF70398-279C-4BA7-9FA1-2879A49F4EC9@gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Oct 12, 2025, at 09:25, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> While playing around with the test cases for pg_dump compression,
> I was startled to discover that the performance of compress_lz4's
> "stream API" code is absolutely abysmal. Here is a simple test
> case to demonstrate, using the regression database as test data:
>
> $ pg_dump -Fd --compress=lz4 -f rlz4.dir regression
> $ time pg_restore -f /dev/null rlz4.dir
>
> real 0m0.023s
> user 0m0.017s
> sys 0m0.006s
>
> So far so good, but now let's compress the toc.dat file:
>
> $ lz4 -f -m --rm rlz4.dir/toc.dat
> $ time pg_restore -f /dev/null rlz4.dir
>
> real 0m1.335s
> user 0m1.326s
> sys 0m0.008s
>
> Considering that lz4 prides itself on fast decompression speed,
> that is not a sane result. Decompressing the file only requires
> a couple ms on my machine:
>
> $ time lz4cat rlz4.dir/toc.dat.lz4 >/dev/null
>
> real 0m0.002s
> user 0m0.000s
> sys 0m0.002s
>
> So on this example, pg_restore is something more than 600x slower
> to read the TOC data than it ought to be.
>
I tested the patch on my MacBook M4 (Sequoia 15.6.1), compiled with -O2 optimization:
- For LZ4 performance improvement:
Without the fix (latest master):
```
chaol(at)ChaodeMacBook-Air tmp % pg_dump -Fd --compress=lz4 -f rlz4.dir regression
chaol(at)ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.03s user 0.03s system 11% cpu 0.463 total
chaol(at)ChaodeMacBook-Air tmp % lz4 -f -m --rm rlz4.dir/toc.dat
# It took 1.59s to restore from compressed data, much slower than from uncompressed data.
chaol(at)ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 1.59s user 0.02s system 97% cpu 1.653 total
```
With the fix:
```
chaol(at)ChaodeMacBook-Air tmp % pg_dump -Fd --compress=lz4 -f rlz4.dir regression
chaol(at)ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.02s user 0.03s system 16% cpu 0.305 total
chaol(at)ChaodeMacBook-Air tmp % lz4 -f -m --rm rlz4.dir/toc.dat
# Much faster !!!
chaol(at)ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.02s user 0.02s system 93% cpu 0.043 total
```
- For Gzip zero read bug:
Without the fix:
```
chaol(at)ChaodeMacBook-Air tmp % pg_dump -Fd --compress=gzip -f rgzip.dir regression
chaol(at)ChaodeMacBook-Air tmp % gzip rgzip.dir/toc.dat
# Failed
chaol(at)ChaodeMacBook-Air tmp % pg_restore -f /dev/null rgzip.dir
pg_restore: error: could not read from input file:
```
With the fix:
```
chaol(at)ChaodeMacBook-Air tmp % pg_dump -Fd --compress=gzip -f rgzip.dir regression
chaol(at)ChaodeMacBook-Air tmp % gzip rgzip.dir/toc.dat
# Ran successfully
chaol(at)ChaodeMacBook-Air tmp % pg_restore -f /dev/null rgzip.dir
```
I also reviewed the code change, basically LGTM. Just a couple of trivial comments:
1 - 0001
In WriteDataToArchiveLZ4()
```
+ required = LZ4F_compressBound(chunk, &state->prefs);
+ if (required > state->buflen - state->bufdata)
+ {
+ cs->writeF(AH, state->buffer, state->bufdata);
+ state->bufdata = 0;
+ }
```
And in EndCompressorLZ4()
```
+ required = LZ4F_compressBound(0, &state->prefs);
+ if (required > state->buflen - state->bufdata)
+ {
+ cs->writeF(AH, state->buffer, state->bufdata);
+ state->bufdata = 0;
+ }
```
These two code pieces are similar, only difference is the first parameter passed to LZ4F_compressBound().
I think we can create an inline function for it. But these are just 5 lines, I don’t have a strong option on that.
Same thing for the other code pieces in LZ4Stream_write() and LZ4Stream_close().
2 - 0003
```
/*
* LZ4 equivalent to feof() or gzeof(). Return true iff there is no
- * decompressed output in the overflow buffer and the end of the backing file
```
This doesn’t belong to the current patch. But “iff” seems a typo of “if”. You may fix it as you are touching this piece of code.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2025-10-13 03:39:31 | Re: [PATCH v1] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Previous Message | Xuneng Zhou | 2025-10-13 03:20:19 | Re: pgstattuple: Use streaming read API in pgstatindex functions |