From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
Cc: | Dimitrios Apostolou <jimis(at)gmx(dot)net>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Date: | 2025-10-13 21:07:52 |
Message-ID: | 717335.1760389672@sss.pgh.pa.us |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>> On Oct 12, 2025, at 09:25, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> While playing around with the test cases for pg_dump compression,
>> I was startled to discover that the performance of compress_lz4's
>> "stream API" code is absolutely abysmal.
> I tested the patch on my MacBook M4 (Sequoia 15.6.1), compiled with -O2 optimization:
Thanks for looking at it!
> I also reviewed the code change, basically LGTM. Just a couple of trivial comments:
> 1 - 0001
> In WriteDataToArchiveLZ4()
> ```
> + required = LZ4F_compressBound(chunk, &state->prefs);
> + if (required > state->buflen - state->bufdata)
> + {
> + cs->writeF(AH, state->buffer, state->bufdata);
> + state->bufdata = 0;
> + }
> ```
> And in EndCompressorLZ4()
> ```
> + required = LZ4F_compressBound(0, &state->prefs);
> + if (required > state->buflen - state->bufdata)
> + {
> + cs->writeF(AH, state->buffer, state->bufdata);
> + state->bufdata = 0;
> + }
> ```
> These two code pieces are similar, only difference is the first parameter passed to LZ4F_compressBound().
> I think we can create an inline function for it. But these are just 5 lines, I don’t have a strong option on that.
Yeah, I don't think that would improve code readability noticeably.
By the time you got done writing a documentation comment for the
new subroutine, the code would probably be longer not shorter.
I've pushed the parts of that patch set that I thought were
uncontroversial. What's left is the business about increasing
DEFAULT_IO_BUFFER_SIZE and then adjusting the tests appropriately.
So, v4-0001 attached is the previous v3-0002 to increase
DEFAULT_IO_BUFFER_SIZE, plus additions in compress_none.c to make
--compress=none also produce predictably large data blocks.
I decided that if we're going to rely on that behavior as part
of the solution for this thread's original problem, we'd better
make it happen for all compression options.
0002 adds a test case in 002_pg_dump.pl to exercise --compress=none,
because without that we don't have any coverage of the new code
0001 added in compress_none.c. That makes for a small increase
in the runtime of 002_pg_dump.pl, but I'm inclined to think it's
worth doing.
0003 modifies the existing test cases that manually compress
blobs.toc files so that they also compress toc.dat. I feel
like it's mostly an oversight that that wasn't done to begin
with; if it had been done, we'd have caught the Gzip_read bug
right away. Also, AFAICT, this doesn't cost anything measurable
in test runtime.
0004 increases the row width in the existing test case that says
it's trying to push more than DEFAULT_IO_BUFFER_SIZE through
the compressors. While I agree with the premise, this solution
is hugely expensive: it adds about 12% to the already-long runtime
of 002_pg_dump.pl. I'd like to find a better way, but ran out of
energy for today. (I think the reason this costs so much is that
it's effectively iterated hundreds of times because of
002_pg_dump.pl's more or less cross-product approach to testing
everything. Maybe we should pull it out of that structure?)
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Align-the-data-block-sizes-of-pg_dump-s-various-c.patch | text/x-diff | 6.1 KB |
v4-0002-Add-a-test-case-to-cover-pg_dump-with-compress-no.patch | text/x-diff | 1.6 KB |
v4-0003-Include-compression-of-toc.dat-in-manually-compre.patch | text/x-diff | 4.1 KB |
v4-0004-Widen-the-wide-row-used-to-verify-correct-de-comp.patch | text/x-diff | 2.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Rishu Bagga | 2025-10-13 21:14:07 | Re: [PATCH] Write Notifications Through WAL |
Previous Message | Kirill Reshke | 2025-10-13 20:59:19 | Re: Remove custom redundant full page write description from GIN |