| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Marco Nenciarini <marco(dot)nenciarini(at)enterprisedb(dot)com> |
| Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery |
| Date: | 2026-03-17 01:15:22 |
| Message-ID: | CABPTF7Vb0y2JsmWb2m6sodH9Ttde=++KP9Bwk-3KpnbcT3e43w@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Mar 17, 2026 at 9:04 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> Hi,
>
> Thanks for the patch.
>
> On Tue, Mar 17, 2026 at 5:49 AM Marco Nenciarini
> <marco(dot)nenciarini(at)enterprisedb(dot)com> wrote:
> >
> > Attached is a v2 patch that implements the "handshake clamp" approach
> > Xuneng suggested. Rather than tracking lastStreamedFlush in
> > process-local state (which doesn't survive a cascade restart, as
> > Fujii-san demonstrated), it uses the WAL flush position already
> > returned by IDENTIFY_SYSTEM.
> >
> > The walreceiver now checks the upstream's flush position before issuing
> > START_REPLICATION. If the requested startpoint is ahead (on the same
> > timeline), it waits for wal_retrieve_retry_interval and retries. This
> > works across restarts since it queries the upstream's live position on
> > every connection attempt, and requires no new state variables.
> >
> > When timelines differ, we let START_REPLICATION handle the timeline
> > negotiation as before.
> >
> > The patch includes a TAP test (053_cascade_reconnect.pl) that
> > reproduces the scenario and verifies the fix.
> >
>
> I haven’t looked into it in detail yet, but it looks good overall.
> I’ll test it further and verify that the issue has been resolved.

One thing I’m not sure about is whether we need to create a standalone
test file for this patch, or if it would fit well within existing TAP
tests.

I found several places for integration:

001_stream_rep.pl: it already has a primary -> standby -> cascading
standby setup, and it even touches primary_conninfo reload behavior.
But it is already a large mixed-purpose file, and this bug needs a
fairly specific archive-fallback reconnection story. Adding it there
would make that file even less focused.

025_stuck_on_old_timeline.pl: this is the nearest thematic neighbor
since it combines cascading replication and archive/stream
interactions. But it is really about timeline-following after
promotion, not “downstream advances via archive and then must
reconnect to an upstream that is still behind”.

048_vacuum_horizon_floor.pl: it already exercises stopping and
restarting walreceiver via primary_conninfo reload, but it has nothing
to do with archive fallback or cascading reconnect logic.

The failure scenario is specific enough, and the three-node setup plus
archive fallback plus reconnect check seems to be a coherent
reproducer on its own.
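
For reference, the condition any such test has to exercise can be
modelled as below. This is only a rough Python sketch of the handshake
clamp as described in the quoted text, not code from the patch: the
names (UpstreamIdentity, decide, parse_lsn) are hypothetical, and
treating "ahead" as a strict greater-than comparison is my assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    START_REPLICATION = "start_replication"
    WAIT_AND_RETRY = "wait_and_retry"


@dataclass(frozen=True)
class UpstreamIdentity:
    """Hypothetical holder for the two IDENTIFY_SYSTEM fields used here."""
    timeline: int
    flush_lsn: int


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'X/Y' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def decide(startpoint: int, start_timeline: int,
           upstream: UpstreamIdentity) -> Action:
    """Model of the pre-START_REPLICATION check described above."""
    # Timelines differ: leave the negotiation to START_REPLICATION.
    if start_timeline != upstream.timeline:
        return Action.START_REPLICATION
    # Same timeline, but the requested startpoint is ahead of what the
    # upstream has flushed (assumed strict comparison): back off and
    # retry after wal_retrieve_retry_interval.
    if startpoint > upstream.flush_lsn:
        return Action.WAIT_AND_RETRY
    return Action.START_REPLICATION
```

In these terms, the reproducer needs the cascading standby to advance
via archive so that its startpoint is past the upstream's flush
position on the same timeline (the WAIT_AND_RETRY branch), and then
needs to confirm that streaming resumes once the upstream catches up.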
--
Best,
Xuneng