Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery

From: Marco Nenciarini <marco(dot)nenciarini(at)enterprisedb(dot)com>
To: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Date: 2026-03-21 10:37:40
Message-ID: CA+nrD2cufuehWJL9nmUYNixPiUACUJ2Z4X1ogHgYuiYndFm4gA@mail.gmail.com
Lists: pgsql-hackers

Hi Xuneng,

On Fri, Mar 21, 2026 at 10:07 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> I think this rounds down to the start of the segment that contains
> targetPagePtr + reqLen. It does not round up. So if both standbys have
> replayed the same record, the cascade's startpoint would land at the
> beginning of the current segment, which would give the upstream server
> a legitimate LSN. This case would be fine. Am I missing something
> here?

You're right that it rounds down, but the value being rounded is not
the replay position. Let me trace the exact path:

1. XLogPageRead calls WaitForWALToBecomeAvailable(targetPagePtr + reqLen).
After archive recovery finishes reading segment N, the next page
request is for the first page of segment N+1, so this value is
already in segment N+1.

2. WaitForWALToBecomeAvailable sets ptr = RecPtr (line 3843) and
passes it to RequestXLogStreaming.

3. RequestXLogStreaming truncates to the segment boundary, but since
the value is already at (or just past) the start of segment N+1,
it stays at the start of N+1.

4. On the upstream side, GetStandbyFlushRecPtr returns replayPtr,
which is the end of the last replayed record inside segment N.

5. The start of N+1 is greater than replayPtr, which is still inside N,
so the upstream rejects the request as "ahead of the WAL flush position".

So the gap doesn't come from the truncation itself -- it comes from
archive recovery processing whole segment files. After both nodes
replay the same archived segment N, the cascade's next read position
is already in N+1 while the upstream reports a position inside N.
The truncation determines the exact startpoint (segment boundary)
but the "ahead" condition exists regardless of it.

I'll update the code comment in the patch to describe this more
precisely.

Best regards,
Marco
