| From: | Kirk Wolak <wolakk(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Nikolay Samokhvalov <nik(at)postgres(dot)ai>, "lepihov(at)gmail(dot)com" <lepihov(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
| Subject: | [PATCH v1] Allow excluded tables to remain in the pg_restore data-loading path for directory-format dumps |
| Date: | 2026-06-22 22:17:14 |
| Message-ID: | CACLU5mTrtt2ocj_UhVC2_4tjXRi7oMCxK17WjoeX7khwRvZjxQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hackers,
This patch adds a directory-format pg_dump option that preserves TABLE DATA
restore entries for excluded tables by writing empty COPY placeholder
files, allowing those files to be replaced before pg_restore.
While over 500 lines changed. Only 79 Lines of code were added, 5 removed!
Almost half is tests for various edge cases!
Motivation
Our pg_dumps of critical data finishes in under 20 minutes. We exclude 6
very large tables. But this dump is then partially useful. For Staging,
we then have a manual process to pull in limited amounts of data on those 6
tables, and re-index. It's painful. With this patch, we can run 6 simple
COPY commands of those tables (limiting to recent data), replacing the
NNN.dat files and voila, a normal pg_restore and we are finished!
With this option, excluded tables still appear in the archive as TABLE DATA
entries with the usual COPY statement, and a per-table data file is created
containing only the COPY end marker (\.). Before restore, those placeholder
files can be replaced with externally generated COPY data. pg_restore then
loads the tables in the normal order, including loading data before
dependent index creation, without extra post-restore steps.
The patch
The new option is --create-table-data-placeholders. It requires
--exclude-table-data or --exclude-table-data-and-children, and is supported
only for directory-format dumps (-Fd) that use COPY data (the default). It
is rejected for --inserts, --column-inserts, --rows-per-insert,
--schema-only, and --no-data.
For excluded ordinary tables, pg_dump now emits the TABLE DATA TOC entry
and creates the numbered .dat file, but does not copy table rows. Sequences
and materialized views are unchanged: they keep the existing exclusion
behavior, since they are not restored from replaceable table data files in
this workflow.
Documentation and a TAP test (src/bin/pg_dump/t/
012_pg_dump_empty_excluded_data.pl) are included.
There are 28 individual TAP assertions in one new test file!
Testing
The new TAP test covers option validation, placeholder file contents,
restore of both included and excluded tables, and confirmation that
excluded sequences and materialized views are not affected by the flag.
Comments welcome. I am happy to adjust naming, documentation, or scope
based on feedback.
Kirk
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-pg-dump-table-data-placeholders.patch | application/octet-stream | 18.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tristan Partin | 2026-06-22 22:30:10 | Re: [PATCH] Warn when io_min_workers exceeds io_max_workers |
| Previous Message | Roberto Mello | 2026-06-22 22:10:53 | Re: The PostgreSQL C Dialect |