Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.

From: Maxim Michkov <m(dot)michkov(at)arenadata(dot)io>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.
Date: 2025-09-02 14:12:14
Message-ID: a4896a62ceab7167824d48b8f0afe01aef6140aa.camel@arenadata.io
Lists: pgsql-bugs

On Fri, 2024-08-09 at 18:26 +0300, Heikki Linnakangas wrote:
> 3. When pg_rewind has nothing to do, the target server is left
> unmodified, in a state such that when you restart it, it will replay all
> the WAL it has locally in pg_wal first, before connecting to the
> primary. Even though the target is a direct ancestor of the source and
> hence it *can* follow the WAL to the source's position without
> rewinding, it doesn't mean that it *will* actually do so.
>
> The attached changes it so that it updates the control file in that
> case, setting minRecoveryPoint and minRecoveryPointTLI to point to the
> source's current WAL position. That way, when you start it up, it will
> follow the timeline history to reach that point. (This requires fixing
> issue 2, because otherwise it still won't follow the history correctly
> to reach the minRecoveryPointTLI)

Hello, I am currently researching a very similar issue. It seems like
your patches work correctly (thank you for those), but I've found a
weird edge case for patch 0004.

Basically, if you don't do anything to the primary after promotion (so
that its last checkpoint is *exactly* at the switchpoint), pg_rewind
with patch 0004 sets minRecoveryPoint to the exact LSN of the
switchpoint. After that the standby won't start, producing errors like
`requested timeline 2 does not contain minimum recovery point 0/4711F38
on timeline 2`.

This happens because of this check in xlogrecovery.c:
```
if (!XLogRecPtrIsInvalid(ControlFile->minRecoveryPoint) &&
    tliOfPointInHistory(ControlFile->minRecoveryPoint - 1, expectedTLEs) !=
    ControlFile->minRecoveryPointTLI)
    ereport(FATAL, /* ... */);
```
Because it checks the TLI of minRecoveryPoint - 1, it expects to see the
timeline before the switch (timeline 1), but we actually want it to go
to timeline 2.

It seems like minRecoveryPoint is supposed to indicate the minimum allowed
end of WAL, so when setting it to ensure some WAL is processed (like
the checkpoint WAL on another timeline) we have to set it to LSN+1
instead. Do you think that is the correct fix, or should the -1 in
xlogrecovery.c be removed?
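
To make the off-by-one concrete, here is a minimal standalone sketch of the check. It is a simplified model, not the server code: the types are stand-ins, and `tli_of_point` and `min_recovery_point_ok` are hypothetical names I made up, written from my reading of how tliOfPointInHistory() treats each timeline as a half-open [begin, end) range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the server's types; not the real definitions. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

#define InvalidXLogRecPtr ((XLogRecPtr) 0)

typedef struct
{
    TimeLineID  tli;
    XLogRecPtr  begin;  /* inclusive; invalid means "from the start" */
    XLogRecPtr  end;    /* exclusive; invalid means "still open" */
} TimelineEntry;

/*
 * Hypothetical model of tliOfPointInHistory(): return the timeline whose
 * [begin, end) range contains ptr, or 0 if no entry matches.
 */
TimeLineID
tli_of_point(XLogRecPtr ptr, const TimelineEntry *history, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        const TimelineEntry *tle = &history[i];

        if ((tle->begin == InvalidXLogRecPtr || tle->begin <= ptr) &&
            (tle->end == InvalidXLogRecPtr || ptr < tle->end))
            return tle->tli;
    }
    return 0;
}

/*
 * Hypothetical model of the startup check quoted above: returns true when
 * recovery would proceed, false when it would hit the FATAL error.
 */
bool
min_recovery_point_ok(XLogRecPtr minRecoveryPoint,
                      TimeLineID minRecoveryPointTLI,
                      const TimelineEntry *history, size_t n)
{
    if (minRecoveryPoint == InvalidXLogRecPtr)
        return true;
    return tli_of_point(minRecoveryPoint - 1, history, n) ==
        minRecoveryPointTLI;
}
```

With a history of timeline 2 starting at the switchpoint and timeline 1 ending there, this model rejects minRecoveryPoint = switchpoint with minRecoveryPointTLI = 2 (the byte before the switchpoint is on timeline 1), and accepts switchpoint + 1.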

Attached is a patch to the 010 TAP test that reproduces this behavior.

Attachment: 0001-Tap-test-for-minRecoveryPoint-matching-switchpoint.patch (text/x-patch, 2.5 KB)
