RE: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Bertrand Drouvot' <bertranddrouvot(dot)pg(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "xunengzhou(at)gmail(dot)com" <xunengzhou(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline
Date: 2026-06-08 04:25:45
Message-ID: OS9PR01MB1214908BA67A7811BD6281208F51C2@OS9PR01MB12149.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi Alexander, Bertrand, Xuneng,

Thanks for looking into the failure. Our team also noticed it but could not find the cause.

> Yeah, it looks like there is a race condition here. I think we should check if
> the insertion timeline has already been set (like the walsummarizer is doing).

Sorry if this is a stupid question; I tried to reproduce the failure but could not (see the attached patch).

IIUC, the issue can happen if the walsender must read a WAL record generated
after the promotion but its timeline has not been updated yet.

However, I think logical_read_xlog_page() is called after the new WAL records
are generated, i.e., am_cascading_walsender is already false by that time.
So I am not sure where the race is.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
0001-WIP-try-reproducing-the-race-condition-for-promotion.patch application/octet-stream 6.3 KB
