Re: pg_upgrade fails with non-standard ACL

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade fails with non-standard ACL
Date: 2019-08-14 00:28:12
Message-ID: 20190814002812.GR16436@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Tue, Aug 13, 2019 at 07:04:42PM +0300, Anastasia Lubennikova wrote:
> > Maybe, as a compromise, we can reset grants to default for all changed
> > functions
> > and also generate a script that will allow a user to preserve privileges of
> > the
> > old cluster by analogy with analyze_new_cluster script.
> > What do you think?
>
> I agree pg_upgrade should work without user correction as much as
> possible. However, as you can see, it can fail when user objects
> reference system table objects that have changed between major releases.

Right.

> As much as it would be nice if the release notes covered all that, and
> we updated pg_upgrade to somehow handle them, it just isn't realistic.
> As we can see here, the problems often take years to show up, and even
> then there were probably many other people who had the problem who never
> reported it to us.

Yeah, the possible changes when you think about column-level privileges
as well really gets to be quite large..

This is why my thinking is that we should come up with additional
default roles, which aren't tied to specific catalog structures but
instead are for a more general set of capabilities which we manage and
users can either decide to use, or not. If they decide to work with the
individual functions then they have to manage the upgrade process if and
when we make changes to those functions.

> I think a realistic approach is to come up with a list of all the user
> behaviors that can cause pg_upgrade to break (by reviewing previous
> pg_upgrade bug reports), and then add code to pg_upgrade to detect them
> and either fix them or report them in --check mode.

In this case, we could, at least conceptually, perform a comparison
between the different major versions and then check for any non-default
privileges for any of the objects changed and then report on those in
--check mode with a recommendation to revert to the default privileges
in the old cluster before running pg_upgrade, and then apply whatever
privileges are desired in the new cluster after the upgrade completes.

> In summary, I am saying that the odds that patch authors, committers,
> release note writers, and pg_upgrade maintainers are going to form a
> consistent work flow that catches all these changes is unrealistic ---
> our best bet is to create something in the pg_upgrade code to detects
> this. pg_upgrade already connects to the old and new cluster, so
> technically it can perform system table modification checks itself.

It'd be pretty neat if pg_upgrade could connect to the old and new
clusters concurrently and then perform a generic catalog comparison
between them and identify all objects which have been changed and
determine if there's any non-default ACLs or dependencies on the catalog
objects which are different between the clusters. That seems like an
awful lot of work though, and I'm not sure there's really any need,
given that we don't change the catalog for a given major version- we
could just generate the list using some queries whenever we release a
new major version and update pg_upgrade with it.

> The only positive is that when pg_upgrade does fail, at least we have a
> system that clearly points to the cause, but unfortunately usually at
> run-time, not at --check time.

Getting it to be at check time would certainly be a great improvement.

Solving this in pg_upgrade does seem like it's probably the better
approach rather than trying to do it in pg_dump. Unfortunately, that
likely means that all we can do is have pg_upgrade point out to the user
when something will fail, with recommendations on how to address it, but
that's also something users are likely used to and willing to accept,
and puts the onus on them to consider their ACL decisions when we're
making catalog changes, and it keeps these issues out of pg_dump.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-08-14 01:12:22 Re: Problem with default partition pruning
Previous Message Thomas Munro 2019-08-13 23:27:38 BF failure: could not open relation with OID XXXX while querying pg_views