| From: | Jimmy Angelakos <jimmy(dot)angelakos(at)pgedge(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade |
| Date: | 2026-03-20 17:47:05 |
| Message-ID: | CAA0-ca0sPhggS1EK25FmpFh_efbT=ZSZWgEOyKzHiUEC27LPnA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi All,
I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is my
first code patch, so any feedback on the approach will be appreciated!
The problem:
============
pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the schema
from $oldcluster to $newcluster. Because this excludes all table data, it
leaves out data in extension config tables registered with
pg_extension_config_dump().
In $newcluster, binary_upgrade_create_empty_extension() creates the
extensions without populating any table data. The extensions' CREATE
EXTENSION scripts never get executed so any INSERTs are skipped. As a
consequence, if any CREATE TABLE statement in $newcluster requires
validation against these empty config tables, the upgrade fails. As an
example,
PostGIS registers config table spatial_ref_sys to hold ~8500 spatial
reference system definitions (SRIDs). When a table has, e.g. a geometry
column that specifies an SRID, this gets validated during the CREATE TABLE:
CREATE TABLE points (id int, location geometry(Point, 27700));
ERROR: Cannot find SRID (27700) in spatial_ref_sys
This will happen for any SRID-constrained column, which will prevent many
real-world PostGIS deployments from being able to pg_upgrade. To summarise
the problem, our ordering is wrong here because extension configuration
data must be present before user tables that depend on it get created, but
--schema-only strips this data.
The patch:
==========
We are adding a new dump object type DO_EXTENSION_DATA that dumps extension
config table data in SECTION_PRE_DATA during --binary-upgrade ONLY. This
restores the needed data between extension creation and user object
creation, allowing the DDL to succeed.
Four files are modified in bin/pg_dump:
pg_dump.h:
Add DO_EXTENSION_DATA to the DumpableObjectType enum, between DO_EXTENSION
and DO_TYPE
pg_dump_sort.c:
Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE
pg_dump.c:
1. Add makeExtensionDataInfo() to create a TableDataInfo with objType =
DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during
--binary-upgrade ONLY. As it depends on the table def, the COPY will be
emitted after the CREATE TABLE.
2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with
description "EXTENSION DATA" using dumpTableData_copy(). This allows the
config table data to go into the schema-only dump.
3. In processExtensionTables(), when dopt->binary_upgrade is true, call
makeExtensionDataInfo() instead of makeTableDataInfo(). Additionally, skip
extcondition filter because we need to dump all rows here.
4. Include DO_EXTENSION_DATA in pre-data boundary in
addBoundaryDependencies()
pg_backup_archiver.c:
Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar to
BLOB, BLOB METADATA, etc. to include extension config table data in
--schema-only dumps during --binary-upgrade ONLY.
What ends up happening:
=======================
The inserted rows are basically scaffolding to allow the upgrade, and do
not persist. The pg_upgrade sequence goes like:
1. pg_dump includes $oldcluster extension config data in schema-only dump
2. pg_restore replays the dump into $newcluster and "EXTENSION DATA"
entries populate tables like spatial_ref_sys with COPY. Subsequent CREATE
TABLEs with e.g. SRID-constrained columns pass validation.
3. pg_upgrade transfers all data files from $oldcluster to $newcluster,
making spatial_ref_sys byte-for-byte identical to its previous state.
This patch:
1. Does NOT affect normal pg_dumps (without --binary-upgrade).
DO_EXTENSION_DATA objects are not created in this case.
2. Leaves binary_upgrade_create_empty_extension() unchanged.
3. Is not PostGIS-specific, and should solve this class of problem for any
extension that registers config tables that will be needed for DDL
validation.
4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster PostGIS
3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on PG19-devel.
Thanks in advance for your review! Please find attached the patch for HEAD.
I believe this should be easily backpatchable to (at least) PG15, and will
be happy to work on backports.
Best regards,
Jimmy
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-pg_dump-Restore-extension-config-table-data-before-u.patch | text/x-patch | 9.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jacob Champion | 2026-03-20 17:52:10 | Re: unclear OAuth error message |
| Previous Message | Jacob Champion | 2026-03-20 17:46:46 | Re: Custom oauth validator options |