Re: Major Version Upgrade failure due to orphan roles entries in catalog

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Virender Singla <virender(dot)cse(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Aniket Jha <aniketkumarj(at)gmail(dot)com>
Subject: Re: Major Version Upgrade failure due to orphan roles entries in catalog
Date: 2026-02-25 13:59:04
Message-ID: CA+TgmoauoiW4ydDhdrseg+DD4Kwha=+TSZp18BrJeHKx3o1Fdw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Feb 20, 2025 at 5:19 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm unsure whether to back-patch the 0001 patch, as it does imply
> more pg_shdepend entries than we have today, so it's sort of a
> backdoor catalog change. But we're mostly interested in the
> transient behavior of having a lock+recheck during entry insertion,
> so maybe it's fine. 0002 should be back-patched in any case.

I recently learned of a case where this commit caused role grants to
be erroneously emitted from the output of pg_dumpall. In the case in
question, a v16 pg_dumpall was used against an older server. Hence,
dump_grantors was false, and any generated GRANT commands would not
have included in the grantor anyway. Nevertheless, this logic caused
those grants to be skipped altogether:

+ if (PQgetisnull(res, i, i_grantor))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("found orphaned pg_auth_members
entry for role %s",
+ PQgetvalue(res, i, i_grantorid));
+ done[i - start] = true;
+ --remaining;
+ continue;
+ }

I don't think this logic makes sense. In pre-16 releases, we don't
even try to maintain the grantor field properly. Consider this test
case:

create role foo;
create role bar;
create role baz createrole;
set role baz;
grant foo to bar;
reset role;
drop role baz;

If you do this on v15 and then run v15's pg_dumpall, it will dump
"GRANT foo to bar", with no GRANTOR clause due to the PQgetisnull()
gating that logic. v16's pg_dumpall will dump nothing and emit a
warning instead. Arguably, pre-v16 pg_dumpall shouldn't EVER be
dumping the grantor since the grantorid could be an old role OID that
has been recycled for a new role, and relying on that for anything
security-critical seems like a mistake, but that behavior is also
longstanding. But omitting the grant altogether seems like an
overreaction. I understand that we need to do that when the *member*
is invalid, of course; in that case, there's no alternative. But
that's not the case for the grantor.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David G. Johnston 2026-02-25 13:59:54 Re: BugReport: PostgreSQL 17.8. Processing UTF8 encoded strings
Previous Message Anthonin Bonnefoy 2026-02-25 09:08:32 Re: BUG #18985: fast shutdown does not close connections from qlik data gateway data movement aka. replicate