| From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
|---|---|
| To: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: LockHasWaiters() crashes on fast-path locks |
| Date: | 2026-03-25 22:06:14 |
| Message-ID: | CALj2ACUKM1KQC=NHESOJw=ZgjAXgxapJfAEovKN0n3ZF8Csr5w@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Wed, Mar 25, 2026 at 2:15 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram(at)gmail(dot)com> wrote:
>
> Hi Hackers,
>
> LockHasWaiters() assumes that the LOCALLOCK's lock and proclock pointers are populated, but this is not the case for locks acquired via the fast-path optimization. Weak locks (< ShareUpdateExclusiveLock) on relations may not be stored in the shared lock hash table, and the LOCALLOCK entry is left with lock = NULL and proclock = NULL in such a case.
>
> If LockHasWaiters() is called for such a lock, it dereferences those NULL pointers when it reads proclock->holdMask and lock->waitMask, causing a segfault.
>
> The only existing caller is lazy_truncate_heap() in VACUUM, which queries LockHasWaitersRelation(rel, AccessExclusiveLock). Since AccessExclusiveLock is the strongest lock level, it is never fast-pathed, so the bug has never been triggered in practice. However, any new caller that passes a weak lock mode (for example, to check whether a DDL statement is waiting on an AccessShareLock) will crash. The fix is to transfer the lock to the main lock table before accessing those pointers.
>
> Attached a patch to address this issue.
Nice find! It would be good to add a test case (perhaps in an existing
test extension even though we may not commit it; it can act as a
demo).
I see that this type of lock transfer already happens for prepared
transactions (see AtPrepare_Locks [1]). The proposed patch relies on
lock == NULL to detect whether the lock was acquired via the fast
path. That looks correct, because a NULL lock or proclock pointer
means the lock was taken via the fast path. For consistency, though,
can we use the same check as AtPrepare_Locks, i.e. proclock == NULL?
[1]
/*
 * If the local lock was taken via the fast-path, we need to move it
 * to the primary lock table, or just get a pointer to the existing
 * primary lock table entry if by chance it's already been
 * transferred.
 */
if (locallock->proclock == NULL)
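For illustration only, here is a minimal standalone C sketch of the pattern discussed above: check proclock == NULL, transfer the lock to the main table, and only then read the wait mask. It is not the patch and not PostgreSQL code. MockLock, MockProcLock, MockLocalLock, mock_transfer_fast_path() and mock_lock_has_waiters() are hypothetical stand-ins, and the single static table entry is an assumption made to keep the demo small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for LOCK, PROCLOCK, LOCALLOCK. */
typedef struct MockLock
{
    unsigned int waitMask;      /* bitmask of lock modes being waited for */
} MockLock;

typedef struct MockProcLock
{
    MockLock   *myLock;         /* shared lock this entry belongs to */
    unsigned int holdMask;      /* bitmask of lock modes held */
} MockProcLock;

typedef struct MockLocalLock
{
    MockLock   *lock;           /* NULL while the lock is fast-path only */
    MockProcLock *proclock;     /* NULL while the lock is fast-path only */
} MockLocalLock;

/* Assumption: one static entry plays the role of the main lock table. */
static MockLock shared_lock;
static MockProcLock shared_proclock;

/*
 * Hypothetical transfer routine: records the held mode in the "main table"
 * entry and returns it.  The real transfer logic is far more involved.
 */
static MockProcLock *
mock_transfer_fast_path(unsigned int held_mode_bit)
{
    shared_proclock.myLock = &shared_lock;
    shared_proclock.holdMask |= held_mode_bit;
    return &shared_proclock;
}

/*
 * Returns true if any mode in conflict_mask is being waited for.  The
 * proclock == NULL test mirrors the check quoted from AtPrepare_Locks.
 */
static bool
mock_lock_has_waiters(MockLocalLock *locallock,
                      unsigned int held_mode_bit,
                      unsigned int conflict_mask)
{
    if (locallock->proclock == NULL)
    {
        /* Fast-path lock: populate both pointers before dereferencing. */
        locallock->proclock = mock_transfer_fast_path(held_mode_bit);
        locallock->lock = locallock->proclock->myLock;
    }

    assert(locallock->lock != NULL);
    return (locallock->lock->waitMask & conflict_mask) != 0;
}
```

Calling mock_lock_has_waiters() on a MockLocalLock whose pointers are both NULL first fills them in and then answers the question, whereas skipping the NULL check would dereference a NULL pointer, which is the crash described upthread.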
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com