| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
| Subject: | Re: Implement waiting for wal lsn replay: reloaded |
| Date: | 2026-05-01 02:44:00 |
| Message-ID: | CABPTF7W-gaO=FAkhda=_pDQJjLne68ioNHHU8vuB4iEnswR1=w@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Alexander,
On Wed, Apr 29, 2026 at 5:01 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:
> On Tue, Apr 21, 2026 at 7:03 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > On Tue, Apr 21, 2026 at 2:46 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com>
> wrote:
> >
> > > The updated patchset is attached. It includes improved coverage as
> > > suggested by Andres upthread. And documentation that WAIT FOR LSN is
> > > timeline-blind (per off-list discussion with Xuneng).
> >
> > I revised the test patch 6 to make the new cases check the intended
> > WAIT FOR behavior more directly, and to avoid cases where the test
> > could pass for the wrong reason.
> >
> > The fresh walreceiver restart test now distinguishes what we can
> > observe from what is only covered indirectly.
> > 'pg_last_wal_receive_lsn()' reports 'flushedUpto', not 'writtenUpto',
> > so the test now describes that state accurately and covers
> > 'writtenUpto' through the 'standby_write' result. This seems
> > appropriate to me since the two positions are seeded in the same places
> > and under the same conditions. A test for the flush LSN should also help
> > verify the write LSN.
> >
> > The fencepost tests were split by the actual frontier being tested.
> > 'standby_replay' uses 'pg_last_wal_replay_lsn()', while
> > 'standby_flush' uses 'pg_last_wal_receive_lsn()'. This avoids treating
> > a replay-derived LSN as if it were also the exact write/flush
> > boundary. I left 'standby_write' out of the exact fencepost helper
> > because its frontier is not SQL-visible once walreceiver is stopped.
> > The async wakeup case now starts the waiter while replay is still
> > paused, so it must actually sleep before replay and walreceiver are
> > allowed to advance.
> >
> > The cascading timeline-switch test now checks the 'WAIT FOR ...
> > NO_THROW' status from background psql stdout. The previous log-marker
> > pattern could pass after an unexpected returned status, including
> > 'timeout', because the following statement would still run. The
> > 'received_tli > 1' check remains, but only as confirmation that the
> > downstream followed the new timeline; the 'success' status proves the
> > wait completed as intended.
> >
> > Please check it.
>
> LGTM, I've added some comments for new functions in 0006. I propose
> to push this patchset. Probably something is still missing and we
> will have to go back to this. But it seems to make a lot of aspects
> much better.
>
I reviewed the patchset and found a potential issue in the test for patch
5, similar to the log-checking problem in the cascading timeline-switch
test. I've applied a minor fix to address it. Other parts LGTM.
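
As a rough illustration of that failure mode, here is a toy Python model. It is
not code from the patch or from the TAP framework: run_session,
log_marker_check, status_check and the "wait_done" marker are all hypothetical
names. It only assumes what was described above, namely that with NO_THROW the
wait returns a status such as 'success' or 'timeout' and the next statement
runs either way.

```python
# Toy model only: none of these names come from the patch or from
# PostgreSQL's test framework; they are made up for illustration.

def run_session(wait_status):
    """Model a session that runs WAIT FOR ... NO_THROW, then a marker statement.

    With NO_THROW the wait reports a status instead of raising an error,
    so the statement after it runs whatever that status was.
    """
    stdout = [wait_status]        # status reported by the wait
    server_log = ["wait_done"]    # marker logged by the following statement
    return stdout, server_log


def log_marker_check(server_log):
    # Log-marker style check: only proves the statement after the wait ran.
    return "wait_done" in server_log


def status_check(stdout):
    # Status style check: the wait itself must have reported success.
    return stdout[0] == "success"


results = {}
for status in ("success", "timeout"):
    out, log = run_session(status)
    results[status] = (log_marker_check(log), status_check(out))
    print(status, results[status])
```

In this model the log-marker check is true for both statuses, while the status
check is true only for 'success', which is why the status check is the
stricter of the two.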
Best,
Xuneng
| Attachment | Content-Type | Size |
|---|---|---|
| v8-0004-Use-replay-position-as-floor-for-WAIT-FOR-LSN-sta.patch | application/octet-stream | 9.8 KB |
| v8-0007-Document-that-WAIT-FOR-LSN-is-timeline-blind.patch | application/octet-stream | 1.9 KB |
| v8-0003-Remove-redundant-WAIT-FOR-LSN-caller-side-pre-che.patch | application/octet-stream | 5.2 KB |
| v8-0006-Improve-WAIT-FOR-LSN-test-coverage.patch | application/octet-stream | 14.0 KB |
| v8-0002-Fix-memory-ordering-in-WAIT-FOR-LSN-wakeup-mechan.patch | application/octet-stream | 4.3 KB |
| v8-0001-Use-barrier-semantics-when-reading-writing-writte.patch | application/octet-stream | 3.0 KB |
| v8-0005-Wake-standby_write-standby_flush-waiters-from-the.patch | application/octet-stream | 5.4 KB |