From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Implement waiting for wal lsn replay: reloaded |
Date: | 2025-08-07 15:00:50 |
Message-ID: | CABPTF7U7_iYEXn3xeV29-xwFWqTc+2N4=CdL7qEzPPg=hv8eqA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thanks for working on this.
I’ve just come across this thread and haven’t had a chance to dig into
the patch yet, but I’m keen to review it soon. In the meantime, I have
a quick question: is WAIT FOR REPLY intended mainly for user-defined
functions, or can internal code invoke it as well?
During a recent performance run [1] I noticed heavy polling in
read_local_xlog_page_guts(). Heikki’s comment from a few months ago
also hints that we could replace this check–sleep–repeat loop with the
condition-variable (CV) infrastructure used by walsender:
/*
* Loop waiting for xlog to be available if necessary
*
* TODO: The walsender has its own version of this function, which uses a
* condition variable to wake up whenever WAL is flushed. We could use the
* same infrastructure here, instead of the check/sleep/repeat style of
* loop.
*/
Because read_local_xlog_page_guts() waits for a specific flush or
replay LSN, polling becomes inefficient when the wait is long. I built
a POC patch that swaps polling for CVs, but a single global CV (or
even separate “flush” and “replay” CVs) isn’t ideal:
The wake-up routines don’t know which LSN each waiter cares about, so
they’d have to broadcast on every flush/replay. Caching the minimum
outstanding LSN could reduce spuriously awakened waiters, yet wouldn’t
eliminate them—multiple backends might wait for different LSNs
simultaneously. A more precise solution would require a request queue
that maps waiters to target LSNs and issues targeted wake-ups, adding
complexity.
Walsender accepts the potential broadcast overhead by using two cvs
for different waiters, so it might be acceptable for
read_local_xlog_page_guts() as well. However, if WAIT FOR REPLY
becomes available to backend code, we might leverage it to eliminate
the polling for waiting replay in read_local_xlog_page_guts() without
introducing a bespoke dispatcher. I’d appreciate any thoughts on
whether that use case is in scope.
Best,
Xuneng
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2025-08-07 15:17:17 | Re: VM corruption on standby |
Previous Message | Tom Lane | 2025-08-07 14:58:56 | Re: headerscheck warnings with late-model gcc |