pgsql: Align the data block sizes of pg_dump's various compression mode

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Align the data block sizes of pg_dump's various compression mode
Date: 2025-10-16 16:54:28
Message-ID: E1v9RFE-0020YR-2D@gemulon.postgresql.org
Lists: pgsql-committers

Align the data block sizes of pg_dump's various compression modes.

After commit fe8192a95, compress_zstd.c tends to produce data block
sizes around 128K, and we don't really have any control over that
unless we want to overrule ZSTD_CStreamOutSize(), which seems like
a bad idea. But let's try to align the other compression modes to
produce block sizes roughly comparable to that, so that pg_restore's
skip-data performance isn't enormously different for different modes.
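
As a rough illustration of why block size matters here: assume the cost
of skipping a table's data grows with the number of blocks that have to
be stepped over. That is an assumption for the sake of the sketch, not
something measured in this message. The helper name blocks_needed is
hypothetical and is not part of pg_dump.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not pg_dump code: number of data blocks needed
 * to hold data_bytes when each block carries at most block_bytes.
 */
static uint64_t
blocks_needed(uint64_t data_bytes, uint64_t block_bytes)
{
	if (block_bytes == 0)
		return 0;				/* guard against division by zero */

	/* ceiling division */
	return (data_bytes + block_bytes - 1) / block_bytes;
}
```

Under that assumption, 1GB of table data stored in 128K blocks comes to
8192 blocks, while the same data stored as one block per 100-byte row
would be over ten million blocks.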

gzip compression can be brought in line simply by setting
DEFAULT_IO_BUFFER_SIZE = 128K, which this patch does. That
increases some unrelated buffer sizes, but none of them seem
problematic for modern platforms.

lz4's idea of appropriate block size is highly nonlinear:
if we just increase DEFAULT_IO_BUFFER_SIZE then the output
blocks end up around 200K. I found that adjusting the slop
factor in LZ4State_compression_init was a not-too-ugly way
of bringing that number roughly into line.

With compress = none you get data blocks the same size as the
table rows, which seems potentially problematic for narrow tables.
Introduce a layer of buffering to make that case match the others.
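
The kind of buffering described above might look roughly like the
following. This is a minimal sketch, not the code in compress_none.c:
the names RowBuffer, rowbuf_*, block_writer, BlockStats and count_block
are all hypothetical, and the real patch may be structured differently.
The idea is just to accumulate row data and emit a block only when the
buffer fills, plus one final partial block at the end.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback that emits one data block to the archive. */
typedef void (*block_writer) (const char *data, size_t len, void *arg);

/* Hypothetical buffering state; not the struct used by the patch. */
typedef struct RowBuffer
{
	char	   *buf;			/* accumulated row data */
	size_t		used;			/* bytes currently held */
	size_t		size;			/* capacity, e.g. 128 * 1024 */
	block_writer write_block;
	void	   *arg;
} RowBuffer;

/* Example writer that only counts what it is given. */
typedef struct BlockStats
{
	size_t		blocks;
	size_t		bytes;
} BlockStats;

static void
count_block(const char *data, size_t len, void *arg)
{
	BlockStats *st = (BlockStats *) arg;

	(void) data;
	st->blocks++;
	st->bytes += len;
}

static int
rowbuf_init(RowBuffer *rb, size_t size, block_writer fn, void *arg)
{
	rb->buf = malloc(size);
	if (rb->buf == NULL)
		return -1;
	rb->used = 0;
	rb->size = size;
	rb->write_block = fn;
	rb->arg = arg;
	return 0;
}

/* Emit whatever is buffered as one block. */
static void
rowbuf_flush(RowBuffer *rb)
{
	if (rb->used > 0)
	{
		rb->write_block(rb->buf, rb->used, rb->arg);
		rb->used = 0;
	}
}

/* Append row data, emitting a block each time the buffer fills. */
static void
rowbuf_write(RowBuffer *rb, const char *data, size_t len)
{
	while (len > 0)
	{
		size_t		room = rb->size - rb->used;
		size_t		chunk = (len < room) ? len : room;

		memcpy(rb->buf + rb->used, data, chunk);
		rb->used += chunk;
		data += chunk;
		len -= chunk;
		if (rb->used == rb->size)
			rowbuf_flush(rb);
	}
}

/* Flush the final partial block and release the buffer. */
static void
rowbuf_end(RowBuffer *rb)
{
	rowbuf_flush(rb);
	free(rb->buf);
	rb->buf = NULL;
}
```

With this arrangement a narrow table produces a few full-size blocks
instead of one tiny block per row; only the last block of a table can
be short.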

Comments in compress_io.h and 002_pg_dump.pl suggest that if
we increase DEFAULT_IO_BUFFER_SIZE then we need to increase the
amount of data fed through the tests in order to improve coverage.
I've not done that here, leaving it for a separate patch.

Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/66ec01dc41243d756896777aa66df149ac8fa31d

Modified Files
--------------
src/bin/pg_dump/compress_io.h | 4 +--
src/bin/pg_dump/compress_lz4.c | 9 ++++--
src/bin/pg_dump/compress_none.c | 64 ++++++++++++++++++++++++++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 72 insertions(+), 6 deletions(-)
