Re: Implement waiting for wal lsn replay: reloaded

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: Implement waiting for wal lsn replay: reloaded
Date: 2026-04-15 08:30:17
Message-ID: CABPTF7W=P_PiKQ5SW-WmadC9vJ=q67MOwGM6iNRwSERF7OF0WA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 9, 2026 at 11:21 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 7:59 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > > > Patch 0001 looks OK to me.
> > > > Regarding patch 0002: the changes made to GetCurrentLSNForWaitType()
> > > > look reliable to me. PerformWalRecovery() sets the replayed positions
> > > > before starting recovery, and therefore before the standby can accept
> > > > connections. So the changes to WalReceiverMain() don't look necessary
> > > > to me.
> > >
> > > Yeah, GetCurrentLSNForWaitType seems to be the right place for the
> > > fix. Please see the attached patch 2.
> > >
> > > I also noticed another related problem:
> > >
> > > During pure archive recovery (no walreceiver), a backend that issues
> > > WAIT FOR LSN ... MODE 'standby_write' with a target ahead of the
> > > current replay position will sleep forever: the startup process
> > > replays past the target but only wakes 'STANDBY_REPLAY' waiters.
> > >
> > > This also affects mixed scenarios: the walreceiver may lag behind
> > > replay (e.g., archive restore has delivered WAL faster than
> > > streaming), so a 'standby_write' waiter could be waiting on WAL that
> > > replay has already consumed.
> > >
> > > I will write a patch to address this soon.
> > >
> >
> > Here is the patch.
>
> I've assembled all the pending patches together.
> 0001 adds a memory barrier to GetWalRcvWriteRecPtr(), as suggested by
> Andres off-list.
> 0002 is basically [1] by Xuneng, but revised given that we now have a
> memory barrier in 0001, and following my proposal to call ResetLatch()
> unconditionally, as in our other latch-based loops.
> 0003 and 0004 are [2] by Xuneng.
> 0005 is [3] by Xuneng.
>
> I'm going to add them to Commitfest to run CI on them, and take a
> closer look at them tomorrow.
>
>
> Links.
> 1. https://www.postgresql.org/message-id/CABPTF7Wjk_FbOghyr09Rzu6T2bh-L_KBMqHK%2BzhRXpssU0STyQ%40mail.gmail.com
> 2. https://www.postgresql.org/message-id/CABPTF7X0iV%3DkGC4gjsTj4NvK_NNEJGM3YTc7Obxs5GOiYoMhEw%40mail.gmail.com
> 3. https://www.postgresql.org/message-id/CABPTF7UBdEfyxATWntmCfoJrwB6iPrnhkXO7y_Avmqc2bOn27A%40mail.gmail.com

I've added the patches to Commitfest.

[1] https://commitfest.postgresql.org/patch/6678/
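
For anyone skimming the thread, the archive-recovery hang quoted above can be illustrated with a small standalone toy model. This is only a sketch of the symptom and of one conceivable remedy (treating the replay position as a lower bound for what a 'standby_write' waiter compares against); it is not taken from the patches, and whether they take this approach is an assumption. All names (ToyLsn, ToyStandbyState, toy_*) are hypothetical and are not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; hypothetical, not PostgreSQL's XLogRecPtr. */
typedef uint64_t ToyLsn;

typedef enum
{
	TOY_WAIT_STANDBY_REPLAY,
	TOY_WAIT_STANDBY_WRITE
} ToyWaitType;

/* Hypothetical snapshot of the two positions discussed in the thread. */
typedef struct
{
	ToyLsn		replayed;	/* how far the startup process has replayed */
	ToyLsn		written;	/* how far a walreceiver has written; stays 0
							 * in this model if no walreceiver ever ran */
} ToyStandbyState;

/* Naive view: a standby_write waiter only looks at the write position. */
static ToyLsn
toy_current_lsn_naive(const ToyStandbyState *s, ToyWaitType type)
{
	assert(s != NULL);
	return (type == TOY_WAIT_STANDBY_WRITE) ? s->written : s->replayed;
}

/*
 * Assumed remedy for illustration: WAL that has been replayed must already
 * exist locally, so use the replay position as a floor for standby_write.
 */
static ToyLsn
toy_current_lsn_floored(const ToyStandbyState *s, ToyWaitType type)
{
	ToyLsn		lsn = toy_current_lsn_naive(s, type);

	if (type == TOY_WAIT_STANDBY_WRITE && lsn < s->replayed)
		lsn = s->replayed;
	return lsn;
}

/* A wait is satisfied once the observed position reaches the target. */
static bool
toy_wait_satisfied(ToyLsn current, ToyLsn target)
{
	return current >= target;
}
```

With replayed = 100 and written = 0 (pure archive recovery in this model), a 'standby_write' target of 50 is never satisfied by the naive view but is satisfied once the replay position is used as a floor. A real fix would also have to wake the sleeping waiters when replay advances, which this model does not cover.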

--
Best,
Xuneng
