Re: Implement waiting for wal lsn replay: reloaded

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: Implement waiting for wal lsn replay: reloaded
Date: 2026-04-06 19:49:08
Message-ID: CAPpHfdtNiSqQCu+YtTYcc+K4q9FwtZuAtQ5Qs+KoaZZM4QyYTA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 6, 2026 at 10:15 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> On Mon, Apr 6, 2026 at 2:01 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 6, 2026 at 11:26 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > >
> > > Hi Alexander,
> > >
> > > On Sun, Apr 5, 2026 at 8:31 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Jan 29, 2026 at 9:47 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > > > > On Tue, Jan 27, 2026 at 3:14 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > > > > > Heikki spotted a misplaced wake-up call for replay waiters in
> > > > > > PerformWalRecovery. He suggested that WaitLSNWakeup needs to be
> > > > > > invoked immediately after a WAL record is applied, to avoid
> > > > > > potential missed wake-ups when recovery stops/pauses/promotes.
> > > > > > It makes sense to me. Please check the attached patch to fix that.
> > > > >
> > > > > Pushed, thank you!
> > > >
> > > > I've assembled a few small patches, which I think are worth pushing
> > > > before the v19 FF.
> > > >
> > > > 1. Avoid a syscache lookup in WaitStmtResultDesc(). This is the [1]
> > > > patch, but I've removed a redundant comment in ExecWaitStmt(), which
> > > > explained the same as the WaitStmtResultDesc() comment.
> > > > 2. Use WAIT FOR LSN in wait_for_catchup(). I made the following
> > > > changes: fall back to polling on a not_in_recovery result instead of
> > > > croaking, avoid a separate pg_is_in_recovery() query, and comment on
> > > > why we may face a recovery conflict and why it is safe to detect it
> > > > by the error string.
> > > > 3. A paragraph in the docs about possible recovery conflicts and
> > > > their reason.
> > > >
> > > > I'm going to push this on Monday if no objections.
> > > >
> > > > Links.
> > > > 1. https://www.postgresql.org/message-id/CABPTF7U%2BSUnJX_woQYGe%3D%3DR9Oz%2B-V6X0VO2stBLPGfJmH_LEhw%40mail.gmail.com
> > > > 2. https://www.postgresql.org/message-id/CABPTF7X0n%3DR50z2fBpj3EbYYz04Ab0-DHJa%2BJfoAEny62QmUdg%40mail.gmail.com
> > >
> > > I'll review them shortly.
> > >
> >
> > Here're some comments:
> >
> > 1) Patch 2 checks for not_in_recovery, but WAIT FOR ... NO_THROW
> > returns “not in recovery”, which appears to prevent the
> > promoted-standby fallback described in the patch from triggering.
> > Instead, it seems to result in a hard failure.
> >
> > There are some behavior changes that I overlooked earlier:
> >
> > 2) Patch 2 changes the helper from "wait until timeout if there is no
> > active replication connection" to "connect to the standby immediately
> > and fail on ordinary connection failures". This appears to differ from
> > the current contract and may affect cases where the standby is down,
> > restarting, or not yet accepting SQL.
> >
> > "If there is no active replication connection from this peer, waits
> > until poll_query_until timeout."
> >
> > 3) In patch 2, we return success as soon as the standby locally
> > reaches the target LSN, but the existing helper is explicitly defined
> > in terms of pg_stat_replication ... state = 'streaming' on the
> > upstream. So the new path can report success even when there is no
> > active replication connection from that peer anymore.
> >
> > "The replication connection must be in a streaming state."
> >
> > I’m not sure whether these semantic changes are intended.
> >
>
> After some thought, these semantic changes appear to be improvements
> over artifacts of the polling-based behavior. The pg_stat_replication
> ... state = 'streaming' condition is a precondition for polling to
> work rather than a post-condition check. The retry timeout likewise
> has nothing to do with these edge cases or the state of the standby.
>
> In practice, wait_for_catchup() is called when the standby is expected
> to be up. If it's not, one of two things is true:
>
> - The standby is briefly restarting -> start() already waits for
>   readiness before returning, so this is a non-issue.
> - The standby is genuinely down -> waiting 180 seconds to time out
>   wastes CI time compared to failing fast.
>
> I’ve addressed point 1 and updated the comments accordingly to reflect
> the new behavior. Please check it.

Thank you, I've pushed your version of the patchset. I made two minor
corrections to patch #2: mention the default mode value in the header
comment, and fall back to polling when has_wal_read_bug reports the
sparc64+ext4 bug.
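For readers following along, here is a minimal sketch of the resulting
wait_for_catchup() control flow as I understand it from this thread. This
is not the committed code: the subroutine name, the exact WAIT FOR LSN
syntax (including NO_THROW and its 'not in recovery' result string), and
the has_wal_read_bug() call site are assumptions reconstructed from the
discussion above, not verified against the pushed patch.

```perl
use strict;
use warnings;

# Hypothetical sketch of the helper discussed in this thread, assuming
# PostgreSQL::Test::Cluster-style node objects ($primary, $standby).
sub wait_for_catchup_sketch
{
	my ($primary, $standby, $mode, $target_lsn) = @_;
	$mode //= 'replay';    # default mode, per the header-comment fix

	# Fast path: WAIT FOR LSN on the standby.  Skipped on platforms
	# where has_wal_read_bug() reports the sparc64+ext4 WAL-read bug,
	# in which case we go straight to polling.
	if ($mode eq 'replay'
		&& !PostgreSQL::Test::Utils::has_wal_read_bug())
	{
		# With NO_THROW (assumed syntax), problems come back as a
		# result string instead of an error; 'not in recovery' means
		# the node was promoted, so fall back to polling rather than
		# croaking (point 1 above).
		my $result = $standby->safe_psql('postgres',
			qq{WAIT FOR LSN '$target_lsn' NO_THROW;});
		return if $result ne 'not in recovery';
	}

	# Fallback: poll pg_stat_replication on the primary, as the helper
	# did before this change.
	$primary->poll_query_until('postgres',
		qq{SELECT '$target_lsn' <= ${mode}_lsn AND state = 'streaming'
		   FROM pg_stat_replication});
	return;
}
```

(No standalone test is included since this sketch requires a live
primary/standby cluster to execute.)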

------
Regards,
Alexander Korotkov
Supabase
