Re: pg_rewind with cascade standby doesn't work well

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: Kuwamura Masaki <kuwamura(at)db(dot)is(dot)i(dot)nagoya-u(dot)ac(dot)jp>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_rewind with cascade standby doesn't work well
Date: 2023-09-12 06:10:17
Message-ID: ZQAAySzrxSdK5NNO@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 11, 2023 at 07:04:30PM +0300, Aleksander Alekseev wrote:
> Many thanks for submitting the patch. I added it to the nearest open
> commitfest [1].
>
> IMO a test is needed that makes sure no one is going to break this in
> the future.

You definitely need more complex test scenarios for that. If you can
come up with new ways to make the TAP tests of pg_rewind mode modular
in handling more complicated node setups, that would be a nice
addition, for example.

> [1]: https://commitfest.postgresql.org/45/4559/

@@ -7951,6 +7951,15 @@ xlog_redo(XLogReaderState *record)
ereport(PANIC,
(errmsg("unexpected timeline ID %u (should be %u) in end-of-recovery record",
xlrec.ThisTimeLineID, replayTLI)));
+ /*
+ * Update the control file so that crash recovery can follow the timeline
+ * changes to this point.
+ */
+ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+ ControlFile->minRecoveryPoint = lsn;
+ ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;

This patch is at least incorrect in its handling of crash recovery,
because these two should *never* be set in this case as we want to
replay up to the end of WAL. For example, see xlog.c or the top of
xlogrecovery.c about the assumptions behind these variables:
/* crash recovery should always recover to the end of WAL */
ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
ControlFile->minRecoveryPointTLI = 0;

If an end-of-recovery record is replayed during crash recovery, these
assumptions are plain broken.

One thing that we could consider is to be more aggressive with
restartpoints when replaying this record for a standby, see a few
lines above the lines added by your patch, for example. And we could
potentially emulate a post-promotion restart point to get a refresh of
the control file as it should, with the correct code paths involved in
the updates of minRecoveryPoint when the checkpointer does the job.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-09-12 06:18:05 Re: Make psql's qeury canceling test simple by using signal() routine of IPC::Run
Previous Message Peter Eisentraut 2023-09-12 05:42:45 Re: Make all Perl warnings fatal