pg_dump: eliminate tmpfile double-write in tar format output

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_dump: eliminate tmpfile double-write in tar format output
Date: 2026-04-17 00:47:00
Message-ID: CAK3UJRE_9-iQsQpYnaZFx6RPL9AUqA2wehAc7fNgiY2yhJPZig@mail.gmail.com
Lists: pgsql-hackers

Hi,

Please find attached a patch optimizing pg_dump's tar format (-Ft) when
writing to a seekable file. The diff here is limited to
src/bin/pg_dump/pg_backup_tar.c.

Currently, every TOC entry in a tar-format dump goes through a temporary
file: the data is written to a tmpfile, and on close we seek to the end of
the tmpfile to determine its length, write the tar header, and copy the
entire tmpfile into the tar output. As a result the data is written twice:
once to the tmpfile and once to the final tar file.

The patch adds a "direct-write" mode for seekable outputs. Instead of using
a tmpfile, we write a placeholder tar header (with length 0) directly to
the tar output, stream the data after it, then seek back and rewrite the
header with the actual length. This should roughly halve the data-path I/O,
since each member's data is written once instead of twice.
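
To make the placeholder-then-rewrite idea concrete, here is a minimal
standalone sketch. It is not the code from the patch: DirectMember,
write_header, member_begin, member_write and member_end are hypothetical
names, the header fills in only the name, mode, size and checksum fields of
a ustar header, and plain ftell/fseek stand in for whatever offset handling
the real code uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK 512

/* Hypothetical per-member state for the direct-write path. */
typedef struct
{
    FILE       *tar;            /* seekable tar output */
    long        hdr_pos;        /* offset of this member's header */
    unsigned long long len;     /* data bytes written so far */
    char        name[100];      /* member name, NUL-terminated */
} DirectMember;

/* Write a minimal ustar header at the current position (sizes below 8GB). */
static int
write_header(FILE *tar, const char *name, unsigned long long size)
{
    char        hdr[TAR_BLOCK];
    unsigned int sum = 0;
    int         i;

    memset(hdr, 0, sizeof(hdr));
    snprintf(hdr, 100, "%s", name);             /* name, offset 0 */
    snprintf(hdr + 100, 8, "%07o", 0600u);      /* mode, offset 100 */
    snprintf(hdr + 124, 12, "%011llo", size);   /* size in octal, offset 124 */
    hdr[156] = '0';                             /* typeflag: regular file */
    memcpy(hdr + 257, "ustar", 6);              /* magic, including the NUL */
    memcpy(hdr + 263, "00", 2);                 /* version */
    memset(hdr + 148, ' ', 8);                  /* checksum field counts as spaces */
    for (i = 0; i < TAR_BLOCK; i++)
        sum += (unsigned char) hdr[i];
    snprintf(hdr + 148, 7, "%06o", sum);        /* six octal digits, NUL, space */
    return fwrite(hdr, 1, TAR_BLOCK, tar) == TAR_BLOCK ? 0 : -1;
}

/* Remember where the header goes and write a placeholder with length 0. */
static int
member_begin(DirectMember *m, FILE *tar, const char *name)
{
    m->tar = tar;
    m->len = 0;
    snprintf(m->name, sizeof(m->name), "%s", name);
    m->hdr_pos = ftell(tar);
    if (m->hdr_pos < 0)
        return -1;
    return write_header(tar, m->name, 0);
}

/* Stream data straight into the tar output, counting bytes as we go. */
static int
member_write(DirectMember *m, const void *buf, size_t n)
{
    assert(m->tar != NULL);
    if (fwrite(buf, 1, n, m->tar) != n)
        return -1;
    m->len += n;
    return 0;
}

/* Pad to a block boundary, seek back, rewrite the header, seek forward. */
static int
member_end(DirectMember *m)
{
    static const char zeros[TAR_BLOCK];
    size_t      pad = (size_t) ((TAR_BLOCK - m->len % TAR_BLOCK) % TAR_BLOCK);
    long        end_pos;

    if (pad > 0 && fwrite(zeros, 1, pad, m->tar) != pad)
        return -1;
    end_pos = ftell(m->tar);
    if (end_pos < 0 || fseek(m->tar, m->hdr_pos, SEEK_SET) != 0)
        return -1;
    if (write_header(m->tar, m->name, m->len) != 0)
        return -1;
    return fseek(m->tar, end_pos, SEEK_SET) == 0 ? 0 : -1;
}
```

A caller would run member_begin, any number of member_write calls, then
member_end. The header is written twice, but that is only 512 bytes per
member, whereas the data itself is written once.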

The tmpfile path is preserved as a fallback for three cases:
1. Output is not seekable (stdout/pipe)
2. Another member is already being written directly (guard against
   interleaving)
3. We're in the LO section, where the blob TOC file stays open while
   individual blob data files are written and closed inside it
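
The three fallback cases above amount to a small predicate, sketched below.
This is an illustration only: TarState, its fields, output_is_seekable and
use_direct_write are hypothetical names, not the ones used in the patch, and
the seekability probe is just one plausible way to detect a pipe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical archive state; the field names are assumptions. */
typedef struct
{
    FILE       *out;                /* tar output stream */
    bool        direct_member_open; /* a member is currently being written directly */
    bool        in_lo_section;      /* inside the large-object section */
} TarState;

/* A pipe or terminal fails the relative seek; a regular file passes. */
static bool
output_is_seekable(FILE *fp)
{
    return fseek(fp, 0L, SEEK_CUR) == 0 && ftell(fp) >= 0;
}

/* True if a new member may use direct-write, false to fall back to a tmpfile. */
static bool
use_direct_write(const TarState *st)
{
    assert(st != NULL && st->out != NULL);
    if (!output_is_seekable(st->out))
        return false;               /* case 1: stdout/pipe */
    if (st->direct_member_open)
        return false;               /* case 2: avoid interleaving members */
    if (st->in_lo_section)
        return false;               /* case 3: blob TOC file stays open */
    return true;
}
```

Cases 2 and 3 both guard the same invariant: only one member at a time can
own the region of the tar file that follows its placeholder header.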

On a 500K-row test database (~255MB, 184MB dump file), pg_dump -Ft time
goes down from about 1.42s (master) to 1.22s (patched), roughly 14%. The
relative improvement is somewhat smaller for larger databases: for a
database about 10x as large, dump time goes down from 10.24s (master) to
9.34s (patched), roughly 9%.

A benchmark script (bench_tar_direct_write.sh) is included for reproducing
some of the performance testing I did.

Thanks,
Josh

Attachment                             Content-Type               Size
bench_tar_direct_write.sh              application/x-sh           2.7 KB
pg_backup_tar_mode_direct_write.diff   application/octet-stream   8.8 KB
