Re: pg_upgrade fails with non-standard ACL

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade fails with non-standard ACL
Date: 2019-08-20 13:38:18
Message-ID: 392ca335-068d-7bd3-0ad8-fdf0a45d95d4@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

14.08.2019 3:28, Stephen Frost wrote:
> * Bruce Momjian (bruce(at)momjian(dot)us) wrote:
>
>> As much as it would be nice if the release notes covered all that, and
>> we updated pg_upgrade to somehow handle them, it just isn't realistic.
>> As we can see here, the problems often take years to show up, and even
>> then there were probably many other people who had the problem who never
>> reported it to us.
> Yeah, the possible changes when you think about column-level privileges
> as well really gets to be quite large..
>
> This is why my thinking is that we should come up with additional
> default roles, which aren't tied to specific catalog structures but
> instead are for a more general set of capabilities which we manage and
> users can either decide to use, or not. If they decide to work with the
> individual functions then they have to manage the upgrade process if and
> when we make changes to those functions.

Idea of having special roles looks good to me, though, I don't see
how to define what grants are needed for each role. Let's say, we
define role "backupuser" that obviously must have grants on
pg_start_backup()
and pg_stop_backup(). Should it also have access to pg_is_in_recovery()
or for example version()?

> It'd be pretty neat if pg_upgrade could connect to the old and new
> clusters concurrently and then perform a generic catalog comparison
> between them and identify all objects which have been changed and
> determine if there's any non-default ACLs or dependencies on the catalog
> objects which are different between the clusters. That seems like an
> awful lot of work though, and I'm not sure there's really any need,
> given that we don't change the catalog for a given major version- we
> could just generate the list using some queries whenever we release a
> new major version and update pg_upgrade with it.
>
>> The only positive is that when pg_upgrade does fail, at least we have a
>> system that clearly points to the cause, but unfortunately usually at
>> run-time, not at --check time.
> Getting it to be at check time would certainly be a great improvement.
>
> Solving this in pg_upgrade does seem like it's probably the better
> approach rather than trying to do it in pg_dump. Unfortunately, that
> likely means that all we can do is have pg_upgrade point out to the user
> when something will fail, with recommendations on how to address it, but
> that's also something users are likely used to and willing to accept,
> and puts the onus on them to consider their ACL decisions when we're
> making catalog changes, and it keeps these issues out of pg_dump.

I wrote a prototype to check API and ACL compatibility (see attachment).
In the current implementation it fetches the list of system procedures
for both old and new clusters
and reports deleted functions or ACL changes during pg_upgrade check:

Performing Consistency Checks
-----------------------------
...
Checking for system functions API compatibility
dbname postgres : check procsig is equal pg_stop_backup(), procacl not
equal {anastasia=X/anastasia,backup=X/anastasia} vs {anastasia=X/anastasia}
dbname postgres : procedure pg_stop_backup(exclusive boolean, OUT lsn
pg_lsn, OUT labelfile text, OUT spcmapfile text) doesn't exist in
new_cluster
dbname postgres : procedure pg_switch_xlog() doesn't exist in new_cluster
dbname postgres : procedure pg_xlog_replay_pause() doesn't exist in
new_cluster
dbname postgres : procedure pg_xlog_replay_resume() doesn't exist in
new_cluster
...

I think it's a good first step in the right direction.
Now I have questions about implementation details:

1) How exactly should we report this incompatibility to a user?
I think it's fine to leave the warnings and also write some hint for the
user by analogy with other checks.
"Reset ACL on the problem functions to default in the old cluster to
continue"

Is it enough?

2) This approach can be extended to other catalog modifications, you
mentioned above.
I don't see what else can break pg_upgrade in the same way. However, I
don't mind
implementing more checks, while I work on this issue, if you can point
on them.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
pg_upgrade_ACL_check_v1.patch text/x-patch 6.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-08-20 13:47:22 Re: Cleanup isolation specs from unused steps
Previous Message Robert Haas 2019-08-20 13:08:29 Re: POC: Cleaning up orphaned files using undo logs