| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
| Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Deadlock detector fails to activate on a hot standby replica |
| Date: | 2026-06-30 03:14:54 |
| Message-ID: | CABPTF7WeTmUXcwXh9WsM8DnU+RB8Hw-U+EpZqkf1Lz8D2rBMaQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Sat, Jun 27, 2026 at 2:11 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> On Fri, Jun 26, 2026 at 5:31 PM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
> >
> > Hi Fujii-san, Xuneng Zhou,
> >
> > > 1) It looks that this patch is doing some refactoring around
> > > LockBufferForCleanup. The part being touched seems flaky itself on
> > > HEAD, should we fix the issue in the buffer side first? [1]
> >
> > Thank you for mentioning it. It seems, there is a conflict with
> > the another patch due to the introduction of RegisterPinCountWaiter
> > function, that can not be resolved automatically. I'm ok to wait for
> > the conflicting fix to be released. It seems, our patch should be
> > reconsidered after the conflicting patch release.
>
> OK.
>
> > > 2) buf_state &= ~BM_PIN_COUNT_WAITER in stable versions seems not
> > > necessary to me as LockBufferForCleanup will handle the clearing for
> > > us.
> >
> > It seems it was my mistake when I resolved conflicts when cherry-picking
> > the changes from other branches. I removed this line in my recent v 6.1
> > attached patches. Thank you for catching it.
>
> Thanks.
>
> Here are some additional comments after further review:
>
> 3) should we let RegisterPinCountWaiter do less?
>
> Currently, the function does a lot like:
> RegisterPinCountWaiter()
> {
> check other waiter;
> maybe unlock;
> maybe elog(ERROR);
> register;
> unlock header;
> }
>
> We need to handle the unlock & elog for two
> callers(BufferIsReadyForCleanup and LockBufferForCleanup) with
> different needs, which could be error-prone sometimes.
>
> + Assert((buf_state & BM_PIN_COUNT_WAITER) == 0 ||
> + bufHdr->wait_backend_pgprocno == MyProcNumber);
> +
> + if ((buf_state & BM_PIN_COUNT_WAITER) != 0 &&
> + bufHdr->wait_backend_pgprocno != MyProcNumber)
> + {
> + UnlockBufHdr(bufHdr);
> + elog(ERROR, "multiple processes attempting to wait for pincount 1");
> + }
> +
>
> For example, unlocking the header lock only is sufficient for
> BufferIsReadyForCleanup but not for LockBufferForCleanup as we hold
> the extra content lock. For LockBufferForCleanup, the check seems
> duplicated as we already did it before invoking this function. So to
> make this work, we may need to distinguish these two callers.
>
> Another potential issue of doing safe checks here is the order of
> assert and error out:
> In assert build, if the condition is true, then the header lock and
> content lock would not be released properly. So we may need to adjust
> the order.
>
> This leads me to wonder -- would it be better to let this help
> function to do less, like just for registration and let the callers to
> handle others.
>
> 4) the already-expired standby-delay path may not wake new pin holders
>
> If ltime has already passed, the patch sends
> RECOVERY_CONFLICT_BUFFERPIN once before entering the loop, then waits
> inside the new loop. What if another backend pins the buffer after
> that broadcast?
>
> Before the patch, this seems fine because we return to
> BufferIsReadyForCleanup and enter the function
> ResolveRecoveryConflictWithBufferPin to do the re-check of ltime and
> send the conflict signal once more.
>
> With the new inner loop, BufferIsReadyForCleanup() can re-register and
> continue waiting, but no timeout is active and no new cancelling
> request is sent. So recovery can continue waiting for a buffer pin
> even though max_standby_*_delay has already expired.
>
> To fix this, should we add an in-loop check: if ltime != 0 &&
> GetCurrentTimestamp() >= ltime in here:
>
> + if (got_standby_delay_timeout)
> + {
> + SendRecoveryConflictWithBufferPin(RECOVERY_CONFLICT_BUFFERPIN);
> + break;
> + }
>
> 5) The name of BufferIsReadyForCleanup seems broader than the actual protocol.
>
> The function name seems not align with the intended use of it as the
> comment suggested:
>
> *
> * This is only for the hot-standby path in LockBufferForCleanup(), via
> * ResolveRecoveryConflictWithBufferPin(), after ProcWaitForSignal() returns.
> * The caller must already be registered as the shared buffer's
> * BM_PIN_COUNT_WAITER.
>
> Would it be better to rename it to something like
> RecheckCleanupLockPinWait or RecheckBufferPinWaiter?
I took another look at the patch. Here is an another thought:
6) We might want to be more elaborative on the commit message
From personal experience, the message seems a bit hard to follow if
one does not read the related discussion. I like your analysis of the
problem at the start of this thread [1]. Would it be sensible to
summarize and reuse some part of it for the message?
The motivation of this patch is to fix deadlock detector activation,
but the wait-loop change has a slightly broader benefit. It also
improves the standby timeout path indirectly by avoiding needless
timeout teardown and re-entry on unrelated wakeups, and by keeping any
already-fired timeout state until the loop handles it. Should we
mention this as an additional benefit, or even frame the message
broader like:
Fix the buffer-pin recovery-conflict wait loop.
[1] https://www.postgresql.org/message-id/44c24dcf-5710-410f-b1b6-d10b315f3d51%40postgrespro.ru
--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2026-06-30 03:30:09 | Re: Proposal: Conflict log history table for Logical Replication |
| Previous Message | Tom Lane | 2026-06-30 03:11:01 | Re: PROPERTY GRAPH pg_dump ACL minimization |