| From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
|---|---|
| To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
| Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, "xunengzhou(at)gmail(dot)com" <xunengzhou(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline |
| Date: | 2026-06-08 08:47:48 |
| Message-ID: | aiaBtENl7tTf2MM8@bdtpg |
| Lists: | pgsql-hackers |
Hi,
On Mon, Jun 08, 2026 at 04:25:45AM +0000, Hayato Kuroda (Fujitsu) wrote:
> Hi Alexander, Bertrand, Xuneng,
>
> Thanks for seeing the failure. Our team also recognized but could not find the reason.
>
> > Yeah, it looks like there is a race condition here. I think we should check if
> > the insertion timeline has already been set (like the walsummarizer is doing).
>
> IIUC, the issue can happen if the walsender must read the WAL record generated
> after the promotion but the timeline could not be updated.
>
> However, I think logical_read_xlog_page() is called after the new WAL records
> are generated, i.e., am_cascading_walsender has already been false at that time.
> So not sure where is the race?

I ended up with this conclusion:

During promotion, there is a window where RecoveryInProgress() still
returns true but old timeline WAL segments have already been removed or
recycled by RemoveNonParentXlogFiles() in CleanupAfterArchiveRecovery().
This is because, in StartupXLOG(), WAL segments are cleaned up before
SharedRecoveryState transitions to RECOVERY_STATE_DONE.

If a walsender performing logical decoding calls logical_read_xlog_page()
during this window, it would get the old timeline from GetXLogReplayRecPtr(),
then attempt to open a WAL segment on that old timeline which no longer exists.

Attached:
0001: Fix this race
The fix checks GetWALInsertionTimeLineIfSet() when RecoveryInProgress()
returns true. If InsertTimeLineID is already set (non-zero), the new timeline is
established and we use it directly, avoiding attempts to read from segments that
may have been removed.
0002: Adding a test in 035_standby_logical_decoding.pl
It makes use of a new injection point "promotion-after-wal-segment-cleanup" in
StartupXLOG(), right after CleanupAfterArchiveRecovery() removes old timeline
WAL segments but before SharedRecoveryState is set to RECOVERY_STATE_DONE.
The test fails without the fix in 0001, so it also tends to confirm that the
diagnosis is right.
0003: Apply the same timeline fix to read_local_xlog_page_guts()
Indeed, it could hit the same race as mentioned by Xuneng-San.
0004: Add a test for 0003
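
To make the timeline choice described for 0001 and 0003 concrete, here is a minimal standalone sketch. It is a model, not the patch itself: ModelState, choose_tli_before_fix() and choose_tli_after_fix() are hypothetical names, and the struct fields only stand in for what RecoveryInProgress(), GetXLogReplayRecPtr() and GetWALInsertionTimeLineIfSet() would report to a walsender.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/*
 * Hypothetical snapshot of the state a walsender observes.
 * This is an illustration only, not a PostgreSQL structure.
 */
typedef struct ModelState
{
	bool		recovery_in_progress;	/* stand-in for RecoveryInProgress() */
	TimeLineID	replay_tli;		/* timeline returned with the replay position */
	TimeLineID	insert_tli;		/* 0 until the insertion timeline is set */
} ModelState;

/*
 * Assumed pre-fix behaviour: while recovery is in progress, always use the
 * replay timeline, even if its segments were already removed.
 */
TimeLineID
choose_tli_before_fix(const ModelState *s)
{
	return s->recovery_in_progress ? s->replay_tli : s->insert_tli;
}

/*
 * Behaviour described for 0001/0003: while recovery is in progress, prefer
 * the insertion timeline as soon as it is set (non-zero).
 */
TimeLineID
choose_tli_after_fix(const ModelState *s)
{
	if (s->recovery_in_progress)
	{
		if (s->insert_tli != 0)
			return s->insert_tli;	/* new timeline already established */
		return s->replay_tli;		/* ordinary standby: no promotion yet */
	}
	return s->insert_tli;
}

/* Self-check for the promotion window: recovery still "in progress", TLI 2 set. */
void
check_promotion_window(void)
{
	ModelState	window = {.recovery_in_progress = true,
						  .replay_tli = 1,
						  .insert_tli = 2};

	assert(choose_tli_before_fix(&window) == 1);
	assert(choose_tli_after_fix(&window) == 2);
}
```

In the window described above (recovery still reported as in progress, insertion timeline already set), the first function picks timeline 1, whose segments may be gone, while the second picks timeline 2. On an ordinary standby, where the insertion timeline is still zero, both return the replay timeline.
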
Remark:
Regarding backpatching down to 16: it looks like 0001 to 0004 could be
backpatched as they are down to 17. For 16, we may also want to introduce
GetWALInsertionTimeLineIfSet().
I can have a closer look for the backpatch once we agree on how to fix those
races on master.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-race-condition-in-logical-decoding-timeline-s.patch | text/x-diff | 2.6 KB |
| v1-0002-Add-injection-point-test-for-logical-decoding-tim.patch | text/x-diff | 4.8 KB |
| v1-0003-Apply-the-same-timeline-fix-to-read_local_xlog_pa.patch | text/x-diff | 1.8 KB |
| v1-0004-Add-SQL-path-test-for-read_local_xlog_page_guts-t.patch | text/x-diff | 3.9 KB |