Re: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline
Date: 2026-06-06 12:56:10
Message-ID: CABPTF7UqSh4LCA_Lf2owJg5VBs5jteBfb5q4Jt9BiOdq8bMH8w@mail.gmail.com
Lists: pgsql-hackers

Hi Bertrand,

On Sat, Jun 6, 2026 at 7:07 PM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi Alexander,
>
> On Sat, Jun 06, 2026 at 12:00:00PM +0300, Alexander Lakhin wrote:
> > Hello hackers,
> >
> > That is, walsender requested WAL segment for timeline 1, while in a
> > successful run, it reads WAL for timeline 2.
> >
> > I've managed to reproduce this failure with:
>
> Thanks for the report and the repro!
>
> > As far as I can see, the timeline is chosen in logical_read_xlog_page()
> > depending on the recovery state:
> >     am_cascading_walsender = RecoveryInProgress();
> >
> >     if (am_cascading_walsender)
> >         GetXLogReplayRecPtr(&currTLI);
> >     else
> >         currTLI = GetWALInsertionTimeLine();
>
> Yeah, it looks like there is a race condition here. I think we should check if
> the insertion timeline has already been set (like the walsummarizer is doing).
>
> I'll work on a fix early next week.

This looks like the right direction for a fix. We may want to apply
similar logic to read_local_xlog_page_guts() as well. Although the
failure was reported in the walsender, SQL-level logical decoding uses
the local WAL reader, which follows the same recovery/TLI pattern.
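
To make the suspected window concrete, here is a toy model of the
timeline choice. It is only a sketch, not PostgreSQL code: MockWalState,
choose_tli_quoted() and choose_tli_checked() are hypothetical names, and
it assumes that during promotion the insertion timeline becomes visible
(non-zero) slightly before recovery is reported as finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical stand-in for the shared WAL state; not a PostgreSQL struct. */
typedef struct MockWalState
{
    bool        recovery_in_progress;   /* what RecoveryInProgress() would report */
    TimeLineID  insert_tli;             /* 0 means "insertion timeline not set yet" */
    TimeLineID  replay_tli;             /* timeline of the last replayed record */
} MockWalState;

/* Mirrors the quoted pattern: the choice depends only on the recovery flag. */
TimeLineID
choose_tli_quoted(const MockWalState *s)
{
    if (s->recovery_in_progress)
        return s->replay_tli;
    return s->insert_tli;
}

/*
 * Sketch of the proposed direction: prefer the insertion timeline as soon
 * as it has been set, and fall back to the replay timeline otherwise.
 */
TimeLineID
choose_tli_checked(const MockWalState *s)
{
    TimeLineID  tli;

    tli = (s->insert_tli != 0) ? s->insert_tli : s->replay_tli;

    /* A valid timeline ID is never 0. */
    assert(tli != 0);
    return tli;
}
```

With a state of { true, 2, 1 }, that is, still flagged as in recovery
but with the insertion timeline already set to 2, the quoted pattern
returns timeline 1 while the checked variant returns timeline 2. On a
plain standby ({ true, 0, 1 }) both return timeline 1.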

--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.
