Re: Proper object locking for GRANT/REVOKE

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proper object locking for GRANT/REVOKE
Date: 2025-06-11 15:22:53
Message-ID: f31c4eba-a460-4b8f-8163-0d4dcddea5d4@eisentraut.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09.12.24 02:25, Noah Misch wrote:
>> Ok, we should probably put that comment back in slightly altered form, like
>>
>> "XXX This function intentionally takes only an AccessShareLock ... $REASON.
>> In the face of concurrent DDL, we might easily latch
>> onto an old version of an object, causing the GRANT or REVOKE statement
>> to fail."
>
> Yep.

There is an open item for PG18 for this. So here is a patch that adds a
comment back, mostly from your descriptions in this thread.

>> The change to AccessShareLock at least prevents confusing "cache lookup
>> failed" messages, and might alleviate some security concerns about swapping
>> in a completely different object concurrently (even if we currently think
>> this is not an actual problem).
>
> Perhaps. To me, the v17 behavior smells mildly superior to the v18 behavior.

Hmm. I think there has been a general effort to get rid of internal
errors like "cache lookup failed ..." and replace them with proper
user-facing errors. This change seems in line with that.

An alternative, if we wanted to go back to the old behavior (other than
reverting altogether, since I think the refactoring is still valuable),
would be to allow get_object_address() to work with lockmode == NoLock.
That would require a bit of work, but nothing magical.

>>>> --- a/src/test/isolation/expected/intra-grant-inplace.out
>>>> +++ b/src/test/isolation/expected/intra-grant-inplace.out
>>>> @@ -248,6 +248,6 @@ relhasindex
>>>> -----------
>>>> (0 rows)
>>>> -s4: WARNING: got: cache lookup failed for relation REDACTED
>>>> +s4: WARNING: got: relation "intra_grant_inplace" does not exist
>>>
>>> The affected permutation existed to cover the first LockRelease() in
>>> SearchSysCacheLocked1(). Since this commit, that line no longer has coverage.
>>
>> Do you have an idea how such a test case could be constructed now?
>
> A rough idea. The test worked because REVOKE used only LOCKTAG_TUPLE, which
> didn't mind the LOCKTAG_RELATION from DROP TABLE.
>
> One route might be to find another SearchSysCacheLocked1() caller that takes
> no locks beyond the LOCKTAG_TUPLE taken right in SearchSysCacheLocked1(). I'd
> add a temporary elog to report if that's happening.
> check_lock_if_inplace_updateable_rel() is an example of reporting the absence
> of a lock. If check-world w/ that elog finds some operation reaching that
> circumstance, this test could replace REVOKE with that operation.
>
> Another route would be to remove the catalog row without locking the
> underlying object, e.g. by replacing DROP TABLE with DELETE FROM pg_class.
> That's more artificial.

I have attached here a patch that shows that the last idea does work.

I don't know what the best solution here is. It seems weird to leave
some higher-level code faulty in order to be able to test lower-level
code that attempts to cope with the faults of the higher-level code. I
know that backstops have value, though.

Attachment Content-Type Size
0001-Improve-objectNamesToOids-comment.patch text/plain 1.9 KB
0002-WIP-Attempt-to-put-back-intra-grant-inplace.spec-cov.patch text/plain 1.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2025-06-11 15:53:15 Re: CHECKPOINT unlogged data
Previous Message Dilip Kumar 2025-06-11 14:32:08 Re: Question on error code selection in conflict detection