[PATCH] Fix WAIT FOR LSN standby_write/standby_flush for archive recovery cases

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: [PATCH] Fix WAIT FOR LSN standby_write/standby_flush for archive recovery cases
Date: 2026-04-15 06:44:23
Message-ID: CAHg+QDeHkMcLBKaBu6sxigL2gUsHXye3QQs14zKyD25BnPNAvA@mail.gmail.com
Lists: pgsql-hackers

Hi Alexander, Hackers,

GetCurrentLSNForWaitType() for WAIT_LSN_TYPE_STANDBY_WRITE and
WAIT_LSN_TYPE_STANDBY_FLUSH previously relied on the WAL receiver's
tracked write/flush positions (GetWalRcvWriteRecPtr/GetWalRcvFlushRecPtr).
As a result, WAIT FOR LSN queries can stall even though replay is making
progress. I describe two scenarios below to make the setups clear, but the
underlying problem is the same in both.

(1). When the standby is disconnected from the primary and switches to
reading WAL from the archive, it stays in that mode until no more WAL is
available to replay, and only then switches back to streaming. Until then,
WAIT FOR LSN calls get stuck on the standby even though replay has advanced
beyond the stale WAL receiver position. Switching the XLog source from
archive to streaming is tracked separately in [1].

(2). In archive recovery, no WAL receiver process exists, so these
functions return InvalidXLogRecPtr (0/0). WAIT FOR LSN with the
standby_flush or standby_write mode would always time out, even for WAL
that has been fully replayed.

The fix falls back to the replay LSN (GetXLogReplayRecPtr) when the WAL
receiver position is invalid or behind replay. This is correct because any
WAL that has been replayed has necessarily already been written and flushed
to disk. A TAP test that reproduces the problem is attached.
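
For readers who want the fallback rule in isolation, here is a minimal
standalone sketch. It is not the patch itself: effective_standby_lsn() is a
hypothetical helper name, and the typedef and macro below are local stand-ins
for PostgreSQL's XLogRecPtr and InvalidXLogRecPtr so the snippet compiles on
its own. In the server, the two inputs would come from
GetWalRcvWriteRecPtr()/GetWalRcvFlushRecPtr() and GetXLogReplayRecPtr().

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins (assumptions) for the PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper: choose the LSN that a standby_write/standby_flush
 * wait should compare against.
 *
 * walrcv_lsn: write or flush position reported by the WAL receiver;
 *             InvalidXLogRecPtr if no receiver has ever run.
 * replay_lsn: current replay position.
 */
static XLogRecPtr
effective_standby_lsn(XLogRecPtr walrcv_lsn, XLogRecPtr replay_lsn)
{
    XLogRecPtr result;

    /* Fall back to replay when the receiver position is missing or stale. */
    if (walrcv_lsn == InvalidXLogRecPtr || walrcv_lsn < replay_lsn)
        result = replay_lsn;
    else
        result = walrcv_lsn;

    /* Replayed WAL is already on disk, so we never report less than replay. */
    assert(result >= replay_lsn);
    return result;
}
```

With this rule, scenario (2) (receiver position 0/0) and scenario (1)
(receiver position older than replay) both resolve to the replay LSN, while
a streaming standby whose receiver is ahead of replay keeps using the
receiver position.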

[1]:
https://www.postgresql.org/message-id/CAHg+QDdLmfpS0n0U3U+e+dw7X7jjEOsJJ0aLEsrtxs-tUyf5Ag@mail.gmail.com

Thanks,
Satya

Attachment                                                        Content-Type               Size
0001-Fix-WAIT-FOR-LSN-standby_write-standby_flush-for-arc.patch   application/octet-stream   2.8 KB
0001-Add-TAP-test-for-WAIT-FOR-LSN-during-archive-recover.patch   application/octet-stream   7.4 KB
