| From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
|---|---|
| To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Cc: | huseyin(dot)d3r(at)gmail(dot)com |
| Subject: | BUG #19483: pg_upgrade fails with orphan records in pg_init_priv catalog table |
| Date: | 2026-05-18 05:06:24 |
| Message-ID: | 19483-80de42dc4e62cfd6@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 19483
Logged by: Hüseyin Demir
Email address: huseyin(dot)d3r(at)gmail(dot)com
PostgreSQL version: 18.3
Operating system: CentOS 9
Description:
Hi,
I encountered a problem during the pg_upgrade.
pg_upgrade --check passes all consistency checks but then fails during
the actual upgrade when pg_restore attempts to GRANT privileges to a
role OID (e.g. "16424") that no longer exists. The root cause is orphan
records in pg_init_privs whose ACL entries reference a dropped role by
OID. pg_upgrade does not detect these dangling references during its
pre-flight checks.
Steps to Reproduce
1. Create a role and database with that role as owner:
CREATE ROLE benchmark_owner SUPERUSER;
CREATE DATABASE my_db OWNER benchmark_owner;
2. Connect to the database and install pg_wait_sampling as that role:
\c my_db
SET ROLE benchmark_owner;
CREATE EXTENSION pg_wait_sampling;
RESET ROLE;
3. Verify pg_init_privs now contains rows with benchmark_owner in the
ACL (privtype = 'e', OIDs 16429, 16434, 16439, 16443, etc.)
4. Reassign owned objects and drop the role:
\c postgres
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity WHERE datname = 'my_db';
ALTER DATABASE my_db RENAME TO my_db_v2;
\c my_db_v2
REASSIGN OWNED BY benchmark_owner TO postgres;
DROP OWNED BY benchmark_owner;
\c postgres
DROP ROLE benchmark_owner;
5. Verify orphan records remain in pg_init_privs:
SELECT pip.objoid
FROM pg_init_privs pip
CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
LEFT JOIN pg_authid a ON a.oid = ace.grantee
WHERE a.oid IS NULL AND ace.grantee <> 0;
-- Returns 22 rows referencing OID 16461 (the dropped role)
6. Run pg_upgrade --check:
/usr/pgsql-18/bin/pg_upgrade \
--old-datadir=/var/lib/pgsql/14/data \
--new-datadir=/var/lib/pgsql/18/data \
--old-bindir=/usr/pgsql-14/bin/ \
--new-bindir=/usr/pgsql-18/bin/ \
--check --jobs=4
Result: "Clusters are compatible" -- all checks pass.
[postgres(at)localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data --new-datadir=/var/lib/pgsql/18/data
--old-bindir=/usr/pgsql-14/bin/ --new-bindir=/usr/pgsql-18/bin/ --check
--jobs=4
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking data type usage ok
Checking for not-null constraint inconsistencies ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
*Clusters are compatible*
7. Run pg_upgrade (without --check):
[postgres(at)localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data --new-datadir=/var/lib/pgsql/18/data
--old-bindir=/usr/pgsql-14/bin/ --new-bindir=/usr/pgsql-18/bin/ --jobs=4
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking data type usage ok
Checking for not-null constraint inconsistencies ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade
------------------
Setting locale and encoding for new cluster ok
Analyzing all rows in the new cluster ok
Freezing all rows in the new cluster ok
Deleting files from new pg_xact ok
Copying old pg_xact to new server ok
Setting oldest XID for new cluster ok
Setting next transaction ID and epoch for new cluster ok
Deleting files from new pg_multixact/offsets ok
Copying old pg_multixact/offsets to new server ok
Deleting files from new pg_multixact/members ok
Copying old pg_multixact/members to new server ok
Setting next multixact ID and offset for new cluster ok
Resetting WAL archives ok
Setting the default char signedness for new cluster ok
Setting frozenxid and minmxid counters in new cluster ok
Restoring global objects in the new cluster ok
Restoring database schemas in the new cluster
my_db_v2
*failure*
Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
for
the probable cause of the failure.
Failure, exiting
*failure*
Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16384.log"
for
the probable cause of the failure.
Failure, exiting
child process exited abnormally: status 256
Failure, exiting
[postgres(at)localhost data]$
The pg_upgrade log as follow
[postgres(at)localhost data]$ cat
/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log
command: "/usr/pgsql-18/bin/pg_dump" --host /var/lib/pgsql/18/data --port
50432 --username postgres --no-data --sequence-data --quote-all-identifiers
--binary-upgrade --format=custom --statistics --no-sync
--file="/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
'dbname=my_db_v2' >>
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1
command: "/usr/pgsql-18/bin/pg_restore" --host /var/lib/pgsql/18/data --port
50432 --username postgres --create --exit-on-error --verbose
--transaction-size=250 --dbname template1
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
>>
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1
pg_restore: connecting to database for restore
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating DATABASE "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating DATABASE PROPERTIES "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '',
false);
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating SCHEMA "public"
pg_restore: creating COMMENT "SCHEMA "public""
pg_restore: creating EXTENSION "pg_wait_sampling"
pg_restore: creating COMMENT "EXTENSION "pg_wait_sampling""
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_current(integer)"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_history()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_profile()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_reset_profile()"
pg_restore: creating VIEW "public.pg_wait_sampling_current"
pg_restore: creating VIEW "public.pg_wait_sampling_history"
pg_restore: creating VIEW "public.pg_wait_sampling_profile"
pg_restore: creating ACL "SCHEMA "public""
pg_restore: creating ACL "public.FUNCTION
"pg_wait_sampling_reset_profile"()"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 4338; 0 0 ACL FUNCTION
"pg_wait_sampling_reset_profile"() postgres
pg_restore: error: could not execute query: ERROR: role "16424" does not
exist
Command was: SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
PUBLIC;
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
"postgres";
SET SESSION AUTHORIZATION "16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO
"16424";
RESET SESSION AUTHORIZATION;
SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM
"16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO
"postgres";
Root Cause
When a role is dropped, DROP OWNED BY and REASSIGN OWNED BY do NOT
cascade to pg_init_privs. This leaves ACL entries in pg_init_privs
referencing the dropped role by its numeric OID.
pg_dump (run internally by pg_upgrade) faithfully dumps these orphan
ACL entries and emits SET SESSION AUTHORIZATION "<OID>" commands.
pg_restore then fails because the numeric OID is not a valid role name
in the new cluster.
pg_upgrade's --check phase does not include a check for orphan role
references inside pg_init_privs ACLs, so the problem is not caught
before the destructive upgrade steps begin.
Expected Behavior
pg_upgrade --check should detect orphan ACL entries in pg_init_privs
(where ace.grantee does not match any pg_authid.oid and is not 0) and
report an error before declaring the clusters compatible.
Possible Fix
Add a new pre-flight check in pg_upgrade (src/bin/pg_upgrade/check.c)
that runs the following query against each user database in the old
cluster:
SELECT DISTINCT pip.objoid
FROM pg_init_privs pip
CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
LEFT JOIN pg_authid a ON a.oid = ace.grantee
WHERE a.oid IS NULL AND ace.grantee <> 0;
If any rows are returned, pg_upgrade should abort with a descriptive
error message telling the user to either drop or clean up the orphan
ACL entries before upgrading.
Workaround
Before running pg_upgrade, manually clean up orphan pg_init_privs
entries in all user databases:
-- Identify affected objects
SELECT pip.objoid
FROM pg_init_privs pip
CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
LEFT JOIN pg_authid a ON a.oid = ace.grantee
WHERE a.oid IS NULL AND ace.grantee <> 0;
-- Remove the orphan rows (adjust WHERE clause as needed)
DELETE FROM pg_init_privs
WHERE objoid IN (
SELECT pip.objoid
FROM pg_init_privs pip
CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
LEFT JOIN pg_authid a ON a.oid = ace.grantee
WHERE a.oid IS NULL AND ace.grantee <> 0
);
Note: Direct modification of system catalogs requires
allow_system_table_mods = on.
How should we proceed or solve this case ?
| From | Date | Subject | |
|---|---|---|---|
| Next Message | PG Bug reporting form | 2026-05-18 06:38:34 | BUG #19484: Segmentation fault triggered by FDW |
| Previous Message | Japin Li | 2026-05-18 03:10:04 | Re: BUG #19478: `dblink_close` can be used for injection. |