| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
| Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline |
| Date: | 2026-06-06 12:56:10 |
| Message-ID: | CABPTF7UqSh4LCA_Lf2owJg5VBs5jteBfb5q4Jt9BiOdq8bMH8w@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Bertrand,
On Sat, Jun 6, 2026 at 7:07 PM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi Alexander,
>
> On Sat, Jun 06, 2026 at 12:00:00PM +0300, Alexander Lakhin wrote:
> > Hello hackers,
> >
> > That is, walsender requested WAL segment for timeline 1, while in a
> > successful run, it reads WAL for timeline 2.
> >
> > I've managed to reproduce this failure with:
>
> Thanks for the report and the repro!
>
> > As far as I can see, the timeline is chosen in logical_read_xlog_page()
> > depending on the recovery state:
> >     am_cascading_walsender = RecoveryInProgress();
> >
> >     if (am_cascading_walsender)
> >         GetXLogReplayRecPtr(&currTLI);
> >     else
> >         currTLI = GetWALInsertionTimeLine();
>
> Yeah, it looks like there is a race condition here. I think we should check if
> the insertion timeline has already been set (like the walsummarizer is doing).
>
> I'll work on a fix early next week.
This looks like the right direction for a fix. We may want to apply
similar logic to read_local_xlog_page_guts() as well. Although the
failure was reported in the walsender, SQL logical decoding uses the
local WAL reader and follows the same recovery/TLI pattern.
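
To make the pattern concrete, here is a minimal standalone sketch, not
a patch. It assumes the problematic window is the one where the
insertion timeline has already been set but RecoveryInProgress() still
returns true. The WalStateSnapshot struct and both function names are
hypothetical and only model the decision; real code would query the
server state instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical snapshot of the state a WAL page reader would consult. */
typedef struct WalStateSnapshot
{
    bool        recovery_in_progress;   /* models RecoveryInProgress() */
    TimeLineID  insert_tli;             /* 0 until the insertion TLI is set */
    TimeLineID  replay_tli;             /* TLI of the last replayed record */
} WalStateSnapshot;

/* Current pattern: pick the timeline from the recovery flag alone. */
TimeLineID
choose_tli_by_recovery_flag(const WalStateSnapshot *s)
{
    if (s->recovery_in_progress)
        return s->replay_tli;
    return s->insert_tli;
}

/*
 * Sketched alternative: once the insertion timeline is set, prefer it,
 * even if the recovery flag has not been cleared yet.  Otherwise we must
 * still be in recovery, so fall back to the replay timeline.
 */
TimeLineID
choose_tli_checking_insert_tli(const WalStateSnapshot *s)
{
    if (s->insert_tli != 0)
        return s->insert_tli;

    assert(s->recovery_in_progress);
    return s->replay_tli;
}
```

With a snapshot of {recovery_in_progress = true, insert_tli = 2,
replay_tli = 1}, the first function returns timeline 1 and the second
returns timeline 2. If the same selection were shared by
logical_read_xlog_page() and read_local_xlog_page_guts(), both readers
would agree during that window.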
--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.