From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proper object locking for GRANT/REVOKE |
Date: | 2025-06-11 15:22:53 |
Message-ID: | f31c4eba-a460-4b8f-8163-0d4dcddea5d4@eisentraut.org |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 09.12.24 02:25, Noah Misch wrote:
>> Ok, we should probably put that comment back in slightly altered form, like
>>
>> "XXX This function intentionally takes only an AccessShareLock ... $REASON.
>> In the face of concurrent DDL, we might easily latch
>> onto an old version of an object, causing the GRANT or REVOKE statement
>> to fail."
>
> Yep.
There is an open item for PG18 for this. So here is a patch that adds a
comment back, mostly from your descriptions in this thread.
>> The change to AccessShareLock at least prevents confusing "cache lookup
>> failed" messages, and might alleviate some security concerns about swapping
>> in a completely different object concurrently (even if we currently think
>> this is not an actual problem).
>
> Perhaps. To me, the v17 behavior smells mildly superior to the v18 behavior.
Hmm. I think there has been a general effort to get rid of internal
errors like "cache lookup failed ..." and replace them with proper
user-facing errors. This change seems in line with that.
An alternative, if we wanted to go back to the old behavior (other than
reverting altogether, since I think the refactoring is still valuable),
would be to allow get_object_address() to work with lockmode == NoLock.
That would require a bit of work, but nothing magical.
>>>> --- a/src/test/isolation/expected/intra-grant-inplace.out
>>>> +++ b/src/test/isolation/expected/intra-grant-inplace.out
>>>> @@ -248,6 +248,6 @@ relhasindex
>>>> -----------
>>>> (0 rows)
>>>> -s4: WARNING: got: cache lookup failed for relation REDACTED
>>>> +s4: WARNING: got: relation "intra_grant_inplace" does not exist
>>>
>>> The affected permutation existed to cover the first LockRelease() in
>>> SearchSysCacheLocked1(). Since this commit, that line no longer has coverage.
>>
>> Do you have an idea how such a test case could be constructed now?
>
> A rough idea. The test worked because REVOKE used only LOCKTAG_TUPLE, which
> didn't mind the LOCKTAG_RELATION from DROP TABLE.
>
> One route might be to find another SearchSysCacheLocked1() caller that takes
> no locks beyond the LOCKTAG_TUPLE taken right in SearchSysCacheLocked1(). I'd
> add a temporary elog to report if that's happening.
> check_lock_if_inplace_updateable_rel() is an example of reporting the absence
> of a lock. If check-world w/ that elog finds some operation reaching that
> circumstance, this test could replace REVOKE with that operation.
>
> Another route would be to remove the catalog row without locking the
> underlying object, e.g. by replacing DROP TABLE with DELETE FROM pg_class.
> That's more artificial.
I have attached here a patch that shows that the last idea does work.
I don't know what the best solution here is. It seems weird to leave
some higher-level code faulty in order to be able to test lower-level
code that attempts to cope with the faults of the higher-level code. I
know that backstops have value, though.
Attachment | Content-Type | Size |
---|---|---|
0001-Improve-objectNamesToOids-comment.patch | text/plain | 1.9 KB |
0002-WIP-Attempt-to-put-back-intra-grant-inplace.spec-cov.patch | text/plain | 1.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Christoph Berg | 2025-06-11 15:53:15 | Re: CHECKPOINT unlogged data |
Previous Message | Dilip Kumar | 2025-06-11 14:32:08 | Re: Question on error code selection in conflict detection |