Re: PANIC during crash recovery of a recently promoted standby

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PANIC during crash recovery of a recently promoted standby
Date: 2018-05-14 03:38:36
Message-ID: 20180514033836.GD1528@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, May 12, 2018 at 07:41:33AM +0900, Michael Paquier wrote:
> pg_ctl promote would wait for the control file to be updated, so you
> cannot use it in the TAP tests to trigger the promotion. Still I think
> I found one after waking up? Please note I have not tested it:
> - Use a custom trigger file and then trigger promotion with a signal.
> - Use a sleep command in recovery_end_command to increase the window, as
> what matters is sleeping after CreateEndOfRecoveryRecord updates the
> control file.
> - Issue a restart point on the standby, which will update the control
> file.
> - Stop the standby with immediate mode.
> - Start the standby, it should see unreferenced pages.

I have been looking at that this morning, and actually I have been able
to create a test case where minRecoveryPoint goes backwards using
Pavan's patch. Using a sleep in recovery_end_command has proved to not
be enough so I had to patch the backend with a couple of sleeps and some
processing, mainly:
- One sleep in CreateRestartPoint to make a restart point wait before
updating the control file, which I set at 5s.
- One sleep just after calling CreateEndOfRecoveryRecord, which has been
set at 20s.
- Trigger an asynchronous checkpoint using IPC::Run::start.
- Trigger a promotion with a trigger file and SIGUSR2 to the
postmaster.

The rest of the test is crafted with some magic wait number and adds
some logs to ease the monitoring of the issue. In order to get a crash,
I think that you would need to crash the backend after creating the last
WAL records which generate the invalid page references, and also slow
down the last restart point which makes minRecoveryPoint go backwards,
which is err... Complicated to make deterministic. Still if you apply
the patch attached you would see log entries on the standby as follows:
2018-05-14 12:24:15.065 JST [17352] LOG: selected new timeline ID: 2
2018-05-14 12:24:15.074 JST [17352] LOG: archive recovery complete
2018-05-14 12:24:15.074 JST [17352] WARNING: CreateEndOfRecoveryRecord: minRecoveryPoint is 0/32C4258 before update
2018-05-14 12:24:15.074 JST [17352] WARNING: CreateEndOfRecoveryRecord: minRecoveryPoint is 0/0 after update
2018-05-14 12:24:17.067 JST [17353] WARNING: checkPointCopy.redo =0/30B3D70, lastCheckPoint.redo = 0/31BC208
2018-05-14 12:24:17.067 JST [17353] WARNING: CreateRestartPoint: minRecoveryPoint is 0/0 before update, lastCheckPointEndPtr is 0/31BC2B0
2018-05-14 12:24:17.067 JST [17353] WARNING: CreateRestartPoint: minRecoveryPoint is 0/31BC2B0 after update

So minRecoveryRecord can go definitely go backwards here, which is not
good. Attached is a patch which includes Pavan's fix on top of the test
case I crafted with what is in upthread. You just need to apply it on
HEAD.
--
Michael

Attachment Content-Type Size
min-recovery-point-backwards.patch text/x-diff 10.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-05-14 04:27:29 Re: Compiler warnings with --enable-dtrace
Previous Message Amit Langote 2018-05-14 02:55:41 Re: \d doesn't show partitioned table foreign keys