[PATCH v1] Allow excluded tables to remain in the pg_restore data-loading path for directory-format dumps

From: Kirk Wolak <wolakk(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Nikolay Samokhvalov <nik(at)postgres(dot)ai>, "lepihov(at)gmail(dot)com" <lepihov(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Subject: [PATCH v1] Allow excluded tables to remain in the pg_restore data-loading path for directory-format dumps
Date: 2026-06-22 22:17:14
Message-ID: CACLU5mTrtt2ocj_UhVC2_4tjXRi7oMCxK17WjoeX7khwRvZjxQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hackers,

This patch adds a directory-format pg_dump option that preserves TABLE DATA
restore entries for excluded tables by writing empty COPY placeholder
files, allowing those files to be replaced before pg_restore.

While over 500 lines changed. Only 79 Lines of code were added, 5 removed!

Almost half is tests for various edge cases!

Motivation

Our pg_dumps of critical data finishes in under 20 minutes. We exclude 6
very large tables. But this dump is then partially useful. For Staging,
we then have a manual process to pull in limited amounts of data on those 6
tables, and re-index. It's painful. With this patch, we can run 6 simple
COPY commands of those tables (limiting to recent data), replacing the
NNN.dat files and voila, a normal pg_restore and we are finished!

With this option, excluded tables still appear in the archive as TABLE DATA
entries with the usual COPY statement, and a per-table data file is created
containing only the COPY end marker (\.). Before restore, those placeholder
files can be replaced with externally generated COPY data. pg_restore then
loads the tables in the normal order, including loading data before
dependent index creation, without extra post-restore steps.

The patch

The new option is --create-table-data-placeholders. It requires
--exclude-table-data or --exclude-table-data-and-children, and is supported
only for directory-format dumps (-Fd) that use COPY data (the default). It
is rejected for --inserts, --column-inserts, --rows-per-insert,
--schema-only, and --no-data.

For excluded ordinary tables, pg_dump now emits the TABLE DATA TOC entry
and creates the numbered .dat file, but does not copy table rows. Sequences
and materialized views are unchanged: they keep the existing exclusion
behavior, since they are not restored from replaceable table data files in
this workflow.

Documentation and a TAP test (src/bin/pg_dump/t/
012_pg_dump_empty_excluded_data.pl) are included.
There are 28 individual TAP assertions in one new test file!

Testing

The new TAP test covers option validation, placeholder file contents,
restore of both included and excluded tables, and confirmation that
excluded sequences and materialized views are not affected by the flag.

Comments welcome. I am happy to adjust naming, documentation, or scope
based on feedback.

Kirk

Attachment Content-Type Size
v1-0001-pg-dump-table-data-placeholders.patch application/octet-stream 18.6 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Tristan Partin 2026-06-22 22:30:10 Re: [PATCH] Warn when io_min_workers exceeds io_max_workers
Previous Message Roberto Mello 2026-06-22 22:10:53 Re: The PostgreSQL C Dialect