Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery

From:	Marco Nenciarini <marco(dot)nenciarini(at)enterprisedb(dot)com>
To:	Xuneng Zhou <xunengzhou(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Date:	2026-03-17 09:31:23
Message-ID:	CA+nrD2cVZ2YdfQpk_qwFUzmkR4N5_8H9yL3NVodAmTq3gNDVpg@mail.gmail.com
Lists:	pgsql-hackers

Since this bug dates back to 9.3, the fix will likely need backpatching.
The v2 patch changes the walrcv_identify_system() signature, which would
be an ABI break on stable branches (walrcv_identify_system_fn is a
function pointer in the WalReceiverFunctionsType struct).

Attached is a backpatch-compatible variant that avoids the API change.
Instead of adding a parameter, libpqrcv_identify_system() stores the
flush position in a new global variable (WalRcvIdentifySystemLsn), and
the walreceiver reads it directly. The fix logic and TAP test are
otherwise identical.
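
A minimal sketch of the side-channel idea described above, not the code from the attached patch. Only the name WalRcvIdentifySystemLsn comes from this mail; FakeConn, FakeWalReceiverFunctions, fake_identify_system() and upstream_has_reached() are hypothetical stand-ins, and the comparison in upstream_has_reached() is an assumed use of the value, since the actual fix logic is not spelled out here. The point it illustrates is that the callback type stored in the struct keeps its signature, so already-compiled callers are unaffected, while the extra value travels through a global.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL types (assumptions for this sketch). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical connection object carrying what the upstream would report. */
typedef struct FakeConn
{
    TimeLineID  timeline;    /* upstream's current timeline */
    XLogRecPtr  flush_lsn;   /* upstream's current flush position */
} FakeConn;

/*
 * Callback type kept with its existing two-argument shape. Adding a third
 * parameter here would change the type of a function pointer stored in a
 * struct, which is what makes the signature change an ABI break.
 */
typedef char *(*identify_system_fn) (FakeConn *conn, TimeLineID *primary_tli);

typedef struct FakeWalReceiverFunctions
{
    identify_system_fn identify_system;
} FakeWalReceiverFunctions;

/* Side channel: filled in as a side effect of the identify call. */
XLogRecPtr  WalRcvIdentifySystemLsn = InvalidXLogRecPtr;

static char *
fake_identify_system(FakeConn *conn, TimeLineID *primary_tli)
{
    static char sysid[] = "fake-system-id";

    assert(conn != NULL && primary_tli != NULL);
    *primary_tli = conn->timeline;
    /* Extra result goes to the global instead of a new out-parameter. */
    WalRcvIdentifySystemLsn = conn->flush_lsn;
    return sysid;
}

static const FakeWalReceiverFunctions fake_functions = {fake_identify_system};

/*
 * Hypothetical caller: returns 1 if the upstream reported a flush position
 * at or beyond startpoint, 0 otherwise.
 */
int
upstream_has_reached(FakeConn *conn, XLogRecPtr startpoint)
{
    TimeLineID  tli;

    /* Reset first so a stale value from an earlier call is never used. */
    WalRcvIdentifySystemLsn = InvalidXLogRecPtr;
    (void) fake_functions.identify_system(conn, &tli);

    return WalRcvIdentifySystemLsn != InvalidXLogRecPtr &&
        WalRcvIdentifySystemLsn >= startpoint;
}
```

The trade-off is the usual one for a global side channel: the caller has to read the value right after the call and reset it beforehand, which is why the extended signature remains the tidier option where there is no ABI constraint.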

For master I'd still prefer the v2 approach with the extended signature,
since it's cleaner and there's no ABI constraint.

Best regards,
Marco

Attachment	Content-Type	Size
v2-backpatch-0001-Fix-cascading-standby-reconnect-failure-after-archiv.patch	text/x-patch	12.7 KB

In response to

Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery at 2026-03-16 21:49:44 from Marco Nenciarini

Responses

Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery at 2026-03-17 12:13:36 from Xuneng Zhou
