Re: fixing CREATEROLE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, walther(at)technowledgy(dot)de, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing CREATEROLE
Date: 2025-05-20 18:32:11
Message-ID: CA+TgmobtKXaJksQXpPa9nktEhFgL-ON2XJVzssr8L0A5JtVd1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 30, 2025 at 4:29 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> However, I'd at least like to complain about the fact that
> it breaks pg_dumpall, which is surely not expecting anything
> but the default behavior. If for any reason the restore is
> run under a non-default setting of createrole_self_grant,
> there's a potential of creating role grants that were not
> there in the source database. Admittedly the damage is
> probably limited by the fact that it only applies if the
> restoring user has CREATEROLE but not SUPERUSER, which
> I imagine is a rare case. But don't we need to add
> createrole_self_grant to the set of GUCs that pg_dump[all]
> resets in the emitted SQL?

I spent some time on this today. As you might imagine, it's quite easy
to make pg_dumpall emit SET createrole_self_grant = '', but doing so
seems less useful than I had expected. I wonder if I'm missing
something important here, so I thought I'd better inquire before
proceeding further.

As I see it, the core difficulty is that the output of pg_dumpall
always contains a CREATE ROLE statement for every single role in the
system, even the bootstrap superuser. For starters, that means that if
you simply run initdb and create cluster #1, run initdb again and
create cluster #2, dump the first and restore to the second, you will
get an error, because the same bootstrap superuser will exist in both
systems, so when you feed the output of pg_dumpall to psql, you end up
trying to create a role that already exists. At this point, my head is
already kind of exploding, because I thought we were pretty careful to
try to make it so that pg_dump output can be restored without error
even in the face of pre-existing objects like the public schema and
the plpgsql language, but apparently we haven't applied the same
principle to pg_dumpall.[1]

But if, as you posit above, we were to try running the output of
pg_dumpall through psql as a non-superuser, the problem is a whole lot
worse. You can imagine a pg_dumpall feature that only tries to dump
(on the source system) roles that the dumping user can administer, and
only tries to recreate those roles on the target system, but we
haven't got that feature, so we're going to try to recreate every
single source role on the target system, including the bootstrap user
and the non-superuser who is restoring the dump if they exist on the
source side and any other superusers and any other users created by
other CREATEROLE superusers and it seems to me that under any set of
somewhat-reasonable assumptions you're going to expect a bunch of
error messages to start showing up at this point. In short, trying to
restore pg_dumpall output as a non-superuser appears to be an
unsupported scenario, so the fact that we don't SET
createrole_self_grant = '' to cater to it doesn't really seem like a
bug to many any more.

In fact, I think there's a decent argument that we ought to let the
prevailing value of createrole_self_grant take effect in this
scenario. One pretty likely scenario, at least as it seems to me, is
that the user was superuser on the source system and is not superuser
on the target system but wants to recreate the same set of roles. If
they want to freely access objects owned by those roles as they could
on the source system, then they're going to need self-grants, and we
have no better clue than the value of createrole_self_grant to help us
figure out whether they want that or not.

To state the concern another way, if this is a bug, it should be
possible to construct a test case that fails without the patch and
passes with the patch. But it appears to me that the only way I could
do that is if I programatically edit the dump. And that seems like
cheating, because if we are talking about a scenario where the user is
editing the dump, they can also add SET createrole_self_grant = '' if
desired.

I don't want to make it sound like I now hate the idea of doing as you
proposed here, because I do see the point of nailing down critical
GUCs that can affect the interpretation of SQL statements in places
like pg_dumpall output, and maybe we should do that here ... kinda
just in case? But I'm not altogether sure that's a sufficient
justification, and at any rate I think we need to be clear on whether
that *is* the justification or whether there's something more concrete
that we're trying to make work.

--
Robert Haas
EDB: http://www.enterprisedb.com

[1] Exception: When --binary-upgrade is used, we emit only ALTER ROLE
and not CREATE ROLE for the bootstrap superuser. Why we think the
error is only worth avoiding in the --binary-upgrade case is unclear
to me.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2025-05-20 18:47:36 Re: proposal: schema variables
Previous Message Andres Freund 2025-05-20 18:23:25 Re: Violation of principle that plan trees are read-only