Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Marco Nenciarini <marco(dot)nenciarini(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Date: 2026-03-17 12:13:36
Message-ID: CABPTF7Wp7YuH9=qyM8ESq1QpGGAkq3=nL+F=WmKv7gHpq1XPWQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 17, 2026 at 5:31 PM Marco Nenciarini
<marco(dot)nenciarini(at)enterprisedb(dot)com> wrote:
>
> Since this bug dates back to 9.3, the fix will likely need backpatching.
> The v2 patch changes the walrcv_identify_system() signature, which would
> be an ABI break on stable branches (walrcv_identify_system_fn is a
> function pointer in the WalReceiverFunctionsType struct).
>
> Attached is a backpatch-compatible variant that avoids the API change.
> Instead of adding a parameter, libpqrcv_identify_system() stores the
> flush position in a new global variable (WalRcvIdentifySystemLsn), and
> the walreceiver reads it directly. The fix logic and TAP test are
> otherwise identical.
>
> For master I'd still prefer the v2 approach with the extended signature,
> since it's cleaner and there's no ABI constraint.
>
> Best regards,
> Marco

I think the ABI concern for backpatching is valid, and the proposed
workaround looks reasonable to me. Resetting WalRcvIdentifySystemLsn
before calling walrcv_identify_system() seems like a sensible defensive
move, because it keeps a stale value from an earlier call from being
mistaken for a fresh one, so I've added it in v3. The TAP test has also
been updated.
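
For readers following the thread, here is a minimal standalone sketch of
the pattern being discussed: a callback whose signature cannot change
passes an extra value back through a global, and the caller resets that
global before each call. It is not the code in the attached patch. Only
the name WalRcvIdentifySystemLsn comes from the thread; the callback
type, demo_identify_system(), fetch_upstream_flush(), and the use of the
conn argument to carry a test value are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's XLogRecPtr and InvalidXLogRecPtr. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Side channel named in the thread; everything else here is hypothetical. */
static XLogRecPtr WalRcvIdentifySystemLsn = InvalidXLogRecPtr;

/* Callback type whose signature must stay unchanged on stable branches. */
typedef int (*identify_system_fn) (void *conn);

/*
 * Hypothetical callback: on success it publishes the upstream flush
 * position through the global instead of through a new parameter.
 * For this demo, conn simply points at the value to report.
 */
static int
demo_identify_system(void *conn)
{
    const XLogRecPtr *upstream_flush = conn;

    if (upstream_flush == NULL)
        return -1;              /* failure: global is left untouched */

    WalRcvIdentifySystemLsn = *upstream_flush;
    return 0;
}

/*
 * Hypothetical caller: reset the global first, so that a failed call
 * can never leave a value from a previous call visible.
 */
static XLogRecPtr
fetch_upstream_flush(identify_system_fn fn, void *conn)
{
    WalRcvIdentifySystemLsn = InvalidXLogRecPtr;    /* defensive reset */

    if (fn(conn) != 0)
        return InvalidXLogRecPtr;

    return WalRcvIdentifySystemLsn;
}
```

Without the reset, a call that fails after an earlier success would
leave the old position in the global, and the caller could not tell it
apart from a fresh one.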

--
Best,
Xuneng

Attachment Content-Type Size
v3-backpatch-0001-Fix-cascading-standby-reconnect-failure-after-arc.patch application/octet-stream 13.9 KB
