| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: 048_vacuum_horizon_floor.pl hangs due to wakeup lost inside LockBufferForCleanup |
| Date: | 2026-06-23 02:15:00 |
| Message-ID: | CABPTF7VJ3jg05n68hf0iYB=ULsvH559ZJGLVjdUowRVZzcnWqg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Andres,
On Tue, Jun 23, 2026 at 4:04 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2026-06-17 23:00:00 +0300, Alexander Lakhin wrote:
> > Hello hackers,
> > With this diagnostic addition:
> > --- a/src/backend/storage/buffer/bufmgr.c
> > +++ b/src/backend/storage/buffer/bufmgr.c
> > @@ -6743,9 +6743,12 @@ LockBufferForCleanup(Buffer buffer)
> > }
> > bufHdr->wait_backend_pgprocno = MyProcNumber;
> > PinCountWaitBuf = bufHdr;
> > - UnlockBufHdrExt(bufHdr, buf_state,
> > - BM_PIN_COUNT_WAITER, 0,
> > - 0);
> > +for (volatile int i = 0; i < 10000000; i++);
> > + buf_state = UnlockBufHdrExt(bufHdr, buf_state,
> > + BM_PIN_COUNT_WAITER, 0,
> > + 0);
> > + if (BUF_STATE_GET_REFCOUNT(buf_state) == 1)
> > + elog(LOG, "!!!LockBufferForCleanup| wakeup lost");
> > LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
> >
>
> I don't think seeing BUF_STATE_GET_REFCOUNT(buf_state) == 1 guarantees that a
> wakeup was lost in any sort of way. Nothing prevents another backend from
> acquiring a pin on the buffer while we are waiting for a pincount, so
> occasionally seeing BUF_STATE_GET_REFCOUNT() == 1 here is to be expected.
Yeah, the log is misleading as observing
BUF_STATE_GET_REFCOUNT(buf_state) == 1 could be temporary, not
necessarily leading to lost wake-up.
A better wording seems to be:
elog(LOG, "buffer refcount is 1 after publishing cleanup-lock waiter");
> You also can't expect repeatedly doing
> UnlockBufHdrExt(set_bits=BM_PIN_COUNT_WAITER) to really make sense, because
> BM_PIN_COUNT_WAITER only works if it was set *before* another backend releases
> its pin, because otherwise the other backend obviously won't notice that the
> flag was set if it only was set after the pg_atomic_fetch_sub_u64() in
> UnpinBufferNoOwner().
This also makes sense to me. However the reproducer seems not to
repeatedly do UnlockBufHdrExt(set_bits=BM_PIN_COUNT_WAITER). It uses
a for loop as a delay before setting the BM_PIN_COUNT_WAITER, which
seems ok to me. Am I missing sth here?
>
> I don't think the problem is that we are loosing a signal, the problem is that
> we need to recheck if BUF_STATE_GET_REFCOUNT(buf_state) == 1 after setting
> BM_PIN_COUNT_WAITER. I think this is a bug in (depending on your perspective)
> either
>
> commit 5310fac6e0f
> Author: Andres Freund <andres(at)anarazel(dot)de>
> Date: 2025-11-06 16:43:16 -0500
>
> bufmgr: Use atomic sub for unpinning buffers
>
> or
>
> commit c75ebc657ffce8dab76471da31aafb79fbe3fda2
> Author: Andres Freund <andres(at)anarazel(dot)de>
> Date: 2025-11-06 16:42:10 -0500
>
> bufmgr: Allow some buffer state modifications while holding header lock
>
>
> Because before those commits, the buffer could not be unpinned while we held
> the buffer header lock. With them, this race opens up.
I agree with this as the root cause of the issue. Thanks for your analysis!
--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2026-06-23 02:18:11 | Re: [PATCH] Add pg_get_table_ddl() to reconstruct CREATE TABLE statements |
| Previous Message | Chao Li | 2026-06-23 02:11:52 | Fix variadic argument types for pg_get_xxx_ddl() functions |