ALTER DEFAULT PRIVILEGES is buggy, and so is its testing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: ALTER DEFAULT PRIVILEGES is buggy, and so is its testing
Date: 2018-11-09 01:01:27
Message-ID: 15719.1541725287@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Buildfarm members crake and xenodermus recently fell over with
very similar symptoms in the pg_upgrade test:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2018-11-07%2021%3A47%3A29

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 6350; 0 0 ACL SCHEMA "regress_rls_schema" andrew
pg_restore: [archiver (db)] could not execute query: ERROR: role "33757" does not exist
Command was: GRANT ALL ON SCHEMA "regress_rls_schema" TO "33757";

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=xenodermus&dt=2018-11-08%2023%3A00%3A01

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 5357; 0 0 ACL SCHEMA "mvtest_mvschema" bf
pg_restore: [archiver (db)] could not execute query: ERROR: role "33894" does not exist
Command was: GRANT ALL ON SCHEMA "mvtest_mvschema" TO "33894";

The bogus numeric user "names" are presumably there because aclitemout
will just fall back to printing a referenced role's OID if it fails to
look up the role's name; hence, what we're looking at here is grants
made to since-dropped roles, which somehow didn't get cleaned up.

Now, looking at the regression test scripts that create these particular
schemas, that's rather mind-boggling: they never do any grant or revoke
at all on those schemas, so how'd there come to be any ACL entries?

I think that the answer is that these particular scripts run in parallel
with the "privileges" script, in which we find this:

ALTER DEFAULT PRIVILEGES GRANT ALL ON SCHEMAS TO regress_priv_user2;

What seems to be happening is

1. Given the right concurrent timing, these schemas absorb a GRANT
to regress_priv_user2 during the window in which the above is active.

2. The GRANT apparently doesn't get entered into pg_shdepend, because
when privileges.sql drops the role, there is no complaint. (It's easy
to reproduce this bug without any parallelism.)

3. The regression tests proper don't notice the dangling ACL, but
dump/restore sure does.

The reason we are just seeing this now, no doubt, is Andres having
changed the pg_upgrade test to run the core regression tests in
parallel. I'm sure it's happened a whole lot before without anyone
noticing.

So it seems like testing ALTER DEFAULT PRIVILEGES in a script that
runs in parallel with anything else is a damfool idea. But there
is also a live bug of failure to correctly record privileges granted
this way. Without that bug, we'd have had our noses rubbed in the
parallel race conditions long ago.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-11-09 01:10:13 Re: ON COMMIT actions and inheritance
Previous Message Masahiko Sawada 2018-11-09 00:58:51 Re: Connection slots reserved for replication