| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
| Subject: | Re: Implement waiting for wal lsn replay: reloaded |
| Date: | 2026-04-21 04:03:30 |
| Message-ID: | CABPTF7WJ35p7uidJJZs7fzxBtbVL_0xSFUdZ2Fe8pXh00e=Mxw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Apr 21, 2026 at 2:46 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> The updated patchset is attached. It includes improved coverage as
> suggested by Andres upthread. And documentation that WAIT FOR LSN is
> timeline-blind (per off-list discussion with Xuneng).
I revised the test patch (v5-0006) to make the new cases check the intended
WAIT FOR behavior more directly, and to avoid cases where the test
could pass for the wrong reason.
The fresh walreceiver restart test now distinguishes what we can
observe from what is only covered indirectly.
'pg_last_wal_receive_lsn()' reports 'flushedUpto', not 'writtenUpto',
so the test now describes that state accurately and covers
'writtenUpto' through the 'standby_write' result. This seems
appropriate to me since the two positions are seeded in the same places
and under the same conditions, so a test for the flush LSN should also
help verify the write LSN.
The fencepost tests were split by the actual frontier being tested.
'standby_replay' uses 'pg_last_wal_replay_lsn()', while
'standby_flush' uses 'pg_last_wal_receive_lsn()'. This avoids treating
a replay-derived LSN as if it were also the exact write/flush
boundary. I left 'standby_write' out of the exact fencepost helper
because its frontier is not SQL-visible once walreceiver is stopped.
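As a rough illustration only (not code from the patch), the targets a fencepost test works with can be derived from a frontier LSN as below. The helper names are hypothetical, and the assumption that the test probes the frontier itself and one byte past it is mine, not something the patch states. The output uses a zero-padded low half, which the pg_lsn input function accepts.

```python
def parse_lsn(text: str) -> int:
    """Convert an 'X/Y' LSN string (two hex halves) to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Render a 64-bit position as 'X/Y' with a zero-padded low half."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:08X}"


def fencepost_targets(frontier: str) -> tuple[str, str]:
    """Return (target at the frontier, target one byte past it)."""
    base = parse_lsn(frontier)
    return format_lsn(base), format_lsn(base + 1)


# Hypothetical frontier value, e.g. as read from pg_last_wal_replay_lsn()
# for 'standby_replay' or pg_last_wal_receive_lsn() for 'standby_flush'.
at_frontier, past_frontier = fencepost_targets("0/3000028")
print(at_frontier, past_frontier)
```

The point is that each mode's pair of targets has to come from that mode's own frontier; feeding a replay-derived value into the flush case would test a different boundary than intended.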
The async wakeup case now starts the waiter while replay is still
paused, so it must actually sleep before replay and walreceiver are
allowed to advance.
The cascading timeline-switch test now checks the 'WAIT FOR ...
NO_THROW' status from background psql stdout. The previous log-marker
pattern could pass after an unexpected returned status, including
'timeout', because the following statement would still run. The
'received_tli > 1' check remains, but only as confirmation that the
downstream followed the new timeline; the 'success' status proves the
wait completed as intended.
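To make the difference concrete, here is a minimal sketch of the two checks. It is not the TAP test itself: the function names, the marker string, and the assumed stdout shape (one status row followed by one marker row) are all hypothetical.

```python
def marker_check(stdout: str, marker: str) -> bool:
    """Old pattern: pass if the statement after WAIT FOR produced its marker."""
    return marker in stdout.splitlines()


def status_check(stdout: str) -> bool:
    """New pattern: pass only if the first row, the WAIT FOR status, is 'success'."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    return bool(lines) and lines[0] == "success"


# Assumed output when the wait timed out but the next statement still ran.
timed_out = "timeout\nwait_done\n"
print(marker_check(timed_out, "wait_done"), status_check(timed_out))
```

With NO_THROW, a timeout is returned as a status instead of raising an error, so the marker check cannot tell a timed-out wait from a successful one.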
Please check it.
--
Best,
Xuneng
| Attachment | Content-Type | Size |
|---|---|---|
| v5-0003-Remove-redundant-WAIT-FOR-LSN-caller-side-pre-che.patch | application/octet-stream | 5.2 KB |
| v5-0002-Fix-memory-ordering-in-WAIT-FOR-LSN-wakeup-mechan.patch | application/octet-stream | 4.3 KB |
| v5-0005-Wake-standby_write-standby_flush-waiters-from-the.patch | application/octet-stream | 5.9 KB |
| v5-0001-Use-barrier-semantics-when-reading-writing-writte.patch | application/octet-stream | 3.1 KB |
| v5-0004-Use-replay-position-as-floor-for-WAIT-FOR-LSN-sta.patch | application/octet-stream | 8.7 KB |
| v5-0006-Improve-WAIT-FOR-LSN-test-coverage.patch | application/octet-stream | 12.6 KB |
| v5-0007-Document-that-WAIT-FOR-LSN-is-timeline-blind.patch | application/octet-stream | 1.9 KB |