Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jimmy Angelakos <jimmy(dot)angelakos(at)pgedge(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade
Date: 2026-03-29 18:34:39
Message-ID: dafc934b-f8b2-4b49-8776-703e6d1320be@dunslane.net
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2026-03-20 Fr 1:47 PM, Jimmy Angelakos wrote:
> Hi All,
>
> I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is
> my first code patch, so any feedback on the approach will be appreciated!
>
> The problem:
> ============
> pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the
> schema from $oldcluster to $newcluster. Because this excludes all
> table data, it leaves out data in extension config tables registered
> with pg_extension_config_dump().
>
> In $newcluster, binary_upgrade_create_empty_extension() creates the
> extensions without populating any table data. The extensions' CREATE
> EXTENSION scripts never get executed so any INSERTs are skipped. As a
> consequence, if any CREATE TABLE statement in $newcluster requires
> validation against these empty config tables, the upgrade fails. As an
> example,
> PostGIS registers config table spatial_ref_sys to hold ~8500 spatial
> reference system definitions (SRIDs). When a table has, e.g. a
> geometry column that specifies an SRID, this gets validated during the
> CREATE TABLE:
>
> CREATE TABLE points (id int, location geometry(Point, 27700));
> ERROR:  Cannot find SRID (27700) in spatial_ref_sys
>
> This will happen for any SRID-constrained column, which will prevent
> many real-world PostGIS deployments from being able to pg_upgrade. To
> summarise the problem, our ordering is wrong here because extension
> configuration data must be present before user tables that depend on
> it get created, but --schema-only strips this data.
>
> The patch:
> ==========
> We are adding a new dump object type DO_EXTENSION_DATA that dumps
> extension config table data in SECTION_PRE_DATA during
> --binary-upgrade ONLY. This restores the needed data between extension
> creation and user object creation, allowing the DDL to succeed.
>
> Four files are modified in bin/pg_dump:
>
> pg_dump.h:
> Add DO_EXTENSION_DATA to the DumpableObjectType enum, between
> DO_EXTENSION and DO_TYPE
>
> pg_dump_sort.c:
> Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE
>
> pg_dump.c:
> 1. Add makeExtensionDataInfo() to create a TableDataInfo with objType
> = DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during
> --binary-upgrade ONLY. As it depends on the table def, the COPY will
> be emitted after the CREATE TABLE.
> 2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with
> description "EXTENSION DATA" using dumpTableData_copy(). This allows
> the config table data to go into the schema-only dump.
> 3. In processExtensionTables(), when dopt->binary_upgrade is true,
> call makeExtensionDataInfo() instead of makeTableDataInfo().
> Additionally, skip extcondition filter because we need to dump all
> rows here.
> 4. Include DO_EXTENSION_DATA in pre-data boundary in
> addBoundaryDependencies()
>
> pg_backup_archiver.c:
> Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar
> to BLOB, BLOB METADATA, etc. to include extension config table data in
> --schema-only dumps during --binary-upgrade ONLY.
>
> What ends up happening:
> =======================
> The inserted rows are basically scaffolding to allow the upgrade, and
> do not persist. The pg_upgrade sequence goes like:
> 1. pg_dump includes $oldcluster extension config data in schema-only dump
> 2. pg_restore replays the dump into $newcluster and "EXTENSION DATA"
> entries populate tables like spatial_ref_sys with COPY. Subsequent
> CREATE TABLEs with e.g. SRID-constrained columns pass validation.
> 3. pg_upgrade transfers all data files from $oldcluster to
> $newcluster, making spatial_ref_sys byte-for-byte identical to its
> previous state.
>
> This patch:
> 1. Does NOT affect normal pg_dumps (without --binary-upgrade).
> DO_EXTENSION_DATA objects are not created in this case.
> 2. Leaves binary_upgrade_create_empty_extension() unchanged.
> 3. Is not PostGIS-specific, and should solve this class of problem for
> any extension that registers config tables that will be needed for DDL
> validation.
> 4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster
> PostGIS 3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on
> PG19-devel.
>
> Thanks in advance for your review! Please find attached the patch for
> HEAD. I believe this should be easily backpatchable to (at least)
> PG15, and will be happy to work on backports.

Hi, Jimmy.

First, as you probably know, we don't backpatch features, and I think
this comes into that category. Unfortunately, we're about to close
release 19 for features, so this would need to wait till release 20.

The patch didn't include any tests. It will need them (probably in
src/test/modules/test_pg_dump)

There appears to be a lot of code duplication between
dumpExtensionData() and dumpTableData(). It might be better to refactor
that, perhaps by supplying an extra flag to dumpTableData().

Do make sure to add a Commitfest entry for this is you haven't already
done so.

cheers

andrew

>
> Best regards,
> Jimmy

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Jones 2026-03-29 18:53:46 Re: Add max_wal_replay_size connection parameter to libpq
Previous Message Tom Lane 2026-03-29 18:33:44 Re: docs: warn about post-data-only schema dumps with parallel restore.