| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
| Subject: | Re: Implement waiting for wal lsn replay: reloaded |
| Date: | 2025-12-02 03:08:01 |
| Message-ID: | CABPTF7VZgB0O1eYd-6BmYGw2a6qQN1XTts0a2VshSG5xMRObPQ@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Mon, Dec 1, 2025 at 12:33 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi hackers,
>
> On Tue, Nov 25, 2025 at 7:51 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > Hi!
> >
> > > > > > At the moment, the WAIT FOR LSN command supports only the replay mode.
> > > > > > If we intend to extend its functionality more broadly, one option is
> > > > > > to add a mode option or something similar. Are users expected to wait
> > > > > > for flush (or others) completion in such cases? If not, and the TAP
> > > > > > test is the only intended use, this approach might be a bit of an
> > > > > > overkill.
> > > > >
> > > > > I would say that adding mode parameter seems to be a pretty natural
> > > > > extension of what we have at the moment. I can imagine some
> > > > > clustering solution can use it to wait for certain transaction to be
> > > > > flushed at the replica (without delaying the commit at the primary).
> > > > >
> > > > > ------
> > > > > Regards,
> > > > > Alexander Korotkov
> > > > > Supabase
> > > >
> > > > Makes sense. I'll play with it and try to prepare a follow-up patch.
> > > >
> > > > --
> > > > Best,
> > > > Xuneng
> > >
> > > In terms of extending the functionality of the command, I see two
> > > possible approaches here. One is to keep mode as a mandatory keyword,
> > > and the other is to introduce it as an option in the WITH clause.
> > >
> > > Syntax Option A: Mode in the WITH Clause
> > >
> > > WAIT FOR LSN '0/12345' WITH (mode = 'replay');
> > > WAIT FOR LSN '0/12345' WITH (mode = 'flush');
> > > WAIT FOR LSN '0/12345' WITH (mode = 'write');
> > >
> > > With this option, we can keep "replay" as the default mode. That means
> > > existing TAP tests won’t need to be refactored unless they explicitly
> > > want a different mode.
> > >
> > > Syntax Option B: Mode as Part of the Main Command
> > >
> > > WAIT FOR LSN '0/12345' MODE 'replay';
> > > WAIT FOR LSN '0/12345' MODE 'flush';
> > > WAIT FOR LSN '0/12345' MODE 'write';
> > >
> > > Or a more concise variant using keywords:
> > >
> > > WAIT FOR LSN '0/12345' REPLAY;
> > > WAIT FOR LSN '0/12345' FLUSH;
> > > WAIT FOR LSN '0/12345' WRITE;
> > >
> > > This option produces a cleaner syntax if the intent is simply to wait
> > > for a particular LSN type, without specifying additional options like
> > > timeout or no_throw.
> > >
> > > I don’t have a clear preference among them. I’d be interested to hear
> > > what you or others think is the better direction.
> > >
> >
> > I've implemented a patch that adds MODE support to WAIT FOR LSN.
> >
> > The new grammar looks like:
> >
> > ------
> > WAIT FOR LSN '<lsn>' [MODE { REPLAY | WRITE | FLUSH }] [WITH (...)]
> > ------
> >
> > Two new modes are added: FLUSH and WRITE.
> >
> > Design decisions:
> >
> > 1. MODE as a separate keyword (not in WITH clause) - This follows the
> > pattern used by LOCK command. It also makes the common case more
> > concise.
> >
> > 2. REPLAY as the default - When MODE is not specified, it defaults to REPLAY.
> >
> > 3. Keywords rather than strings - Using `MODE WRITE` rather than `MODE 'write'`
> >
> > The patch set includes:
> > -------
> > 0001 - Extend xlogwait infrastructure with write and flush wait types
> >
> > Adds WAIT_LSN_TYPE_WRITE and WAIT_LSN_TYPE_FLUSH to WaitLSNType enum,
> > along with corresponding wait events and pairing heaps. Introduces
> > GetCurrentLSNForWaitType() to retrieve the appropriate LSN based on
> > wait type, and adds wakeup calls in walreceiver for write/flush
> > events.
> >
> > -------
> > 0002 - Add pg_last_wal_write_lsn() SQL function
> >
> > Adds a new SQL function that returns the current WAL write position on
> > a standby using GetWalRcvWriteRecPtr(). This complements existing
> > pg_last_wal_receive_lsn() (flush) and pg_last_wal_replay_lsn()
> > functions, enabling verification of WAIT FOR LSN MODE WRITE in TAP
> > tests.
> >
> > -------
> > 0003 - Add MODE parameter to WAIT FOR LSN command
> >
> > Extends the parser and executor to support the optional MODE
> > parameter. Updates documentation with new syntax and mode
> > descriptions. Adds TAP tests covering all three modes including
> > mixed-mode concurrent waiters.
> >
> > -------
> > 0004 - Add tab completion for WAIT FOR LSN MODE parameter
> >
> > Adds psql tab completion support: completes MODE after LSN value,
> > completes REPLAY/WRITE/FLUSH after MODE keyword, and completes WITH
> > after mode selection.
> >
> > -------
> > 0005 - Use WAIT FOR LSN in PostgreSQL::Test::Cluster::wait_for_catchup()
> >
> > Replaces polling-based wait_for_catchup() with WAIT FOR LSN when the
> > target is a standby in recovery, improving test efficiency by avoiding
> > repeated queries.
> >
> > The WRITE and FLUSH modes enable scenarios where applications need to
> > ensure WAL has been received or persisted on the standby without
> > waiting for replay to complete.
> >
> > Feedback welcome.
> >
>
> Here is the updated v2 patch set. Most of the updates are in patch 3.
>
> Changes from v1:
>
> Patch 1 (Extend wait types in xlogwait infra)
> - Renamed enum values for consistency (WAIT_LSN_TYPE_REPLAY →
> WAIT_LSN_TYPE_REPLAY_STANDBY, etc.)
>
> Patch 2 (pg_last_wal_write_lsn):
> - Clarified documentation and comment
> - Improved pg_proc.dat description
>
> Patch 3 (MODE parameter):
> - Replaced direct cast with explicit switch statement for WaitLSNMode
> → WaitLSNType conversion
> - Improved FLUSH/WRITE mode documentation with verification function references
> - TAP tests (7b, 7c, 7d): Added walreceiver control for concurrency,
> explicit blocking verification via poll_query_until, and log-based
> completion verification via wait_for_log
> - Fixed a timing issue in TAP test 8 when waiting for all three
> sessions to get their errors after promotion.
>
> --
> Best,
> Xuneng
Here is the updated v3 patch set. All changes are in patch 3:
- Refactor duplicated TAP test code by extracting helper routines for
starting and stopping walreceiver.
- Increase the number of concurrent WRITE and FLUSH waiters in tests
7b and 7c from three to five, matching the number in test 7a.
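
For readers who have not opened patch 3, the explicit switch used for the
WaitLSNMode → WaitLSNType conversion (introduced in v2) can be sketched
roughly as below. This is only an illustrative, standalone sketch and not
the code from the patch: the WAIT_LSN_MODE_* names, the two *_STANDBY names
other than WAIT_LSN_TYPE_REPLAY_STANDBY, the enum ordering, and the helper
name wait_lsn_mode_to_type() are all assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical parser-level modes for the MODE clause (names assumed). */
typedef enum WaitLSNMode
{
	WAIT_LSN_MODE_REPLAY,
	WAIT_LSN_MODE_WRITE,
	WAIT_LSN_MODE_FLUSH
} WaitLSNMode;

/*
 * Internal wait types.  Only WAIT_LSN_TYPE_REPLAY_STANDBY is named in this
 * thread; the other two names and the ordering are assumed.  The order
 * deliberately differs from WaitLSNMode to show why a cast is fragile.
 */
typedef enum WaitLSNType
{
	WAIT_LSN_TYPE_REPLAY_STANDBY,
	WAIT_LSN_TYPE_FLUSH_STANDBY,
	WAIT_LSN_TYPE_WRITE_STANDBY
} WaitLSNType;

/*
 * Map a user-facing mode to an internal wait type (hypothetical helper).
 * With no default branch, a compiler can warn when a new mode is added
 * to the enum but not handled here.
 */
static WaitLSNType
wait_lsn_mode_to_type(WaitLSNMode mode)
{
	switch (mode)
	{
		case WAIT_LSN_MODE_REPLAY:
			return WAIT_LSN_TYPE_REPLAY_STANDBY;
		case WAIT_LSN_MODE_WRITE:
			return WAIT_LSN_TYPE_WRITE_STANDBY;
		case WAIT_LSN_MODE_FLUSH:
			return WAIT_LSN_TYPE_FLUSH_STANDBY;
	}

	/* Not reached for valid enum values; keeps the compiler quiet. */
	assert(!"unrecognized WaitLSNMode");
	return WAIT_LSN_TYPE_REPLAY_STANDBY;
}
```

Compared with a direct cast, the mapping stays correct even if the two
enums are reordered or one of them gains extra members, which is the case
in the sketch above where WRITE and FLUSH are swapped between the enums.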
--
Best,
Xuneng
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0005-Use-WAIT-FOR-LSN-in.patch | application/octet-stream | 3.1 KB |
| v3-0002-Add-pg_last_wal_write_lsn-SQL-function.patch | application/octet-stream | 3.7 KB |
| v3-0003-Add-MODE-parameter-to-WAIT-FOR-LSN-command.patch | application/octet-stream | 38.8 KB |
| v3-0004-Add-tab-completion-for-WAIT-FOR-LSN-MODE-paramete.patch | application/octet-stream | 3.2 KB |
| v3-0001-Extend-xlogwait-infrastructure-with-write-and-flu.patch | application/octet-stream | 10.5 KB |