| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: 048_vacuum_horizon_floor.pl hangs due to wakeup lost inside LockBufferForCleanup |
| Date: | 2026-06-22 20:03:56 |
| Message-ID: | dxqtyud5euy6migdaafvpudsbn7bzcr7vjcc4vpfh2ffe3sk5a@zszfoxuiq7pl |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On 2026-06-17 23:00:00 +0300, Alexander Lakhin wrote:
> Hello hackers,
> With this diagnostic addition:
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -6743,9 +6743,12 @@ LockBufferForCleanup(Buffer buffer)
> }
> bufHdr->wait_backend_pgprocno = MyProcNumber;
> PinCountWaitBuf = bufHdr;
> - UnlockBufHdrExt(bufHdr, buf_state,
> - BM_PIN_COUNT_WAITER, 0,
> - 0);
> +for (volatile int i = 0; i < 10000000; i++);
> + buf_state = UnlockBufHdrExt(bufHdr, buf_state,
> + BM_PIN_COUNT_WAITER, 0,
> + 0);
> + if (BUF_STATE_GET_REFCOUNT(buf_state) == 1)
> + elog(LOG, "!!!LockBufferForCleanup| wakeup lost");
> LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
>
I don't think seeing BUF_STATE_GET_REFCOUNT(buf_state) == 1 guarantees that a
wakeup was lost in any sort of way. Nothing prevents another backend from
acquiring a pin on the buffer while we are waiting for a pincount, so
occasionally seeing BUF_STATE_GET_REFCOUNT() == 1 here is to be expected.
You also can't expect repeatedly doing
UnlockBufHdrExt(set_bits=BM_PIN_COUNT_WAITER) to really make sense, because
BM_PIN_COUNT_WAITER only works if it was set *before* another backend releases
its pin, because otherwise the other backend obviously won't notice that the
flag was set if it only was set after the pg_atomic_fetch_sub_u64() in
UnpinBufferNoOwner().
I don't think the problem is that we are loosing a signal, the problem is that
we need to recheck if BUF_STATE_GET_REFCOUNT(buf_state) == 1 after setting
BM_PIN_COUNT_WAITER. I think this is a bug in (depending on your perspective)
either
commit 5310fac6e0f
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: 2025-11-06 16:43:16 -0500
bufmgr: Use atomic sub for unpinning buffers
or
commit c75ebc657ffce8dab76471da31aafb79fbe3fda2
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: 2025-11-06 16:42:10 -0500
bufmgr: Allow some buffer state modifications while holding header lock
Because before those commits, the buffer could not be unpinned while we held
the buffer header lock. With them, this race opens up.
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2026-06-22 20:11:23 | Re: 048_vacuum_horizon_floor.pl hangs due to wakeup lost inside LockBufferForCleanup |
| Previous Message | Tom Lane | 2026-06-22 19:45:15 | Fix unsafe coding in ResourceOwnerReleaseAll() |