Quick Links

Re: Implement waiting for wal lsn replay: reloaded

From:	Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Implement waiting for wal lsn replay: reloaded
Date:	2025-08-07 15:00:50
Message-ID:	CABPTF7U7_iYEXn3xeV29-xwFWqTc+2N4=CdL7qEzPPg=hv8eqA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

Thanks for working on this.

I’ve just come across this thread and haven’t had a chance to dig into
the patch yet, but I’m keen to review it soon. In the meantime, I have
a quick question: is WAIT FOR REPLY intended mainly for user-defined
functions, or can internal code invoke it as well?

During a recent performance run [1] I noticed heavy polling in
read_local_xlog_page_guts(). Heikki’s comment from a few months ago
also hints that we could replace this check–sleep–repeat loop with the
condition-variable (CV) infrastructure used by walsender:

/*
* Loop waiting for xlog to be available if necessary
*
* TODO: The walsender has its own version of this function, which uses a
* condition variable to wake up whenever WAL is flushed. We could use the
* same infrastructure here, instead of the check/sleep/repeat style of
* loop.
*/

Because read_local_xlog_page_guts() waits for a specific flush or
replay LSN, polling becomes inefficient when the wait is long. I built
a POC patch that swaps polling for CVs, but a single global CV (or
even separate “flush” and “replay” CVs) isn’t ideal:

The wake-up routines don’t know which LSN each waiter cares about, so
they’d have to broadcast on every flush/replay. Caching the minimum
outstanding LSN could reduce spuriously awakened waiters, yet wouldn’t
eliminate them—multiple backends might wait for different LSNs
simultaneously. A more precise solution would require a request queue
that maps waiters to target LSNs and issues targeted wake-ups, adding
complexity.

Walsender accepts the potential broadcast overhead by using two cvs
for different waiters, so it might be acceptable for
read_local_xlog_page_guts() as well. However, if WAIT FOR REPLY
becomes available to backend code, we might leverage it to eliminate
the polling for waiting replay in read_local_xlog_page_guts() without
introducing a bespoke dispatcher. I’d appreciate any thoughts on
whether that use case is in scope.

Best,
Xuneng

[1] https://www.postgresql.org/message-id/CABPTF7VuFYm9TtA9vY8ZtS77qsT+yL_HtSDxUFnW3XsdB5b9ew@mail.gmail.com

In response to

Re: Implement waiting for wal lsn replay: reloaded at 2025-08-05 13:47:07 from Álvaro Herrera

Responses

Re: Implement waiting for wal lsn replay: reloaded at 2025-08-08 07:08:49 from Alexander Korotkov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Aleksander Alekseev	2025-08-07 15:17:17	Re: VM corruption on standby
Previous Message	Tom Lane	2025-08-07 14:58:56	Re: headerscheck warnings with late-model gcc