Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: JoongHyuk Shin <sjh910805(at)gmail(dot)com>
Cc: assam258(at)gmail(dot)com, Ilmar Yunusov <tanswis42(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Prevent repeated deadlock-check signals in standby buffer pin waits
Date: 2026-06-29 12:52:51
Message-ID: CABPTF7VCdsfmvZpvXtuNHYhBYK4ivnfPAAM=vFXsgqYv2p2eyA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 28, 2026 at 6:06 PM JoongHyuk Shin <sjh910805(at)gmail(dot)com> wrote:
>
> Hi Henson,
>
> Thanks for the careful read.
>
> Both timeouts can be set in a single SIGALRM, since handle_sig_alarm() fires
> every timeout that is already due. The if/else order is deliberate. The delay
> timeout is checked first because the cancel supersedes the deadlock check, so
> the both-set case resolves to the cancel, and no defensive branch is needed.
>
> You're right about the coverage too. Only the delay > deadlock_timeout regime
> is exercised today, and -1 is the one where the second wait has no startup-side
> timer to fall back on. This patch will be rebased on top of the deadlock-detector
> fix that lands first, and I'll add a -1 regime test as part of that rebase.

The patch LGTM overall. However, there seems to be a rare race issue
that could lead to lost wake-up and reply-stall.

A problematic interleaving seems possible:

1) The startup process is waiting for a buffer pin, with
BM_PIN_COUNT_WAITER set.
2) STANDBY_DEADLOCK_TIMEOUT fires. handle_sig_alarm() sets the startup
process latch, and StandbyDeadLockHandler() sets
got_standby_deadlock_timeout = true.
3) WaitLatch() returns, but before ProcWaitForSignal() calls
ResetLatch(), the final unpinner unpins the buffer.
4) WakePinCountWaiter() sees BUF_STATE_GET_REFCOUNT(buf_state) == 1,
clears BM_PIN_COUNT_WAITER, and calls ProcSendSignal() for the startup
process.
5) ProcSendSignal() calls SetLatch(), but the latch is already set by
the deadlock timeout, so this wakeup is coalesced.
6) ProcWaitForSignal() then calls ResetLatch(), clearing the latch.
7) v3 sees got_standby_deadlock_timeout, sends the deadlock-check
signal, clears the flag, then enters the new second
ProcWaitForSignal().
8) At that point the buffer is already ready, but there may be no
future unpin to wake the startup process as BM_PIN_COUNT_WAITER has
already been cleared.

If a finite standby delay is set, startup might sleep until
STANDBY_TIMEOUT. If the delay is set to -1, no standby timeout is
armed, this could lead to an infinite waiting until an unrelated latch
event occurs.

To address this, I think this optimization should be rebased on the
code in the related fix [1]. Let the resolve rechecks after each wake:

ProcWaitForSignal(WAIT_EVENT_BUFFER_CLEANUP);

if (BufferIsReadyForCleanup(bufid + 1))
break;

If we are already the last pinner here, just return rather than enter
a second wait; otherwise, re-registers the startup as the pin-count
waiter before waiting again.

[1] https://www.postgresql.org/message-id/flat/44c24dcf-5710-410f-b1b6-d10b315f3d51%40postgrespro.ru

--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2026-06-29 13:19:29 Per-thread leak in ECPG's memory.c
Previous Message Vivek Gadge 2026-06-29 12:41:17 Guidance on Calculating max_locks_per_transaction for a Highly Partitioned Environment