Re: PANIC during crash recovery of a recently promoted standby

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PANIC during crash recovery of a recently promoted standby
Date: 2018-05-11 22:41:33
Message-ID: 20180511224133.GA1891@paquier.xyz
Lists: pgsql-hackers

On Fri, May 11, 2018 at 12:09:58PM -0300, Alvaro Herrera wrote:
> Yeah, I had this exact comment, but I was unable to come up with a test
> case that would cause a problem.

pg_ctl promote waits for the control file to be updated, so you
cannot use it in the TAP tests to trigger the promotion. Still, I think
I found one after waking up. Please note that I have not tested it:
- Use a custom trigger file and then trigger promotion with a signal.
- Use a sleep command in recovery_end_command to increase the window, as
what matters is sleeping after CreateEndOfRecoveryRecord updates the
control file.
- Issue a restart point on the standby, which will update the control
file.
- Stop the standby with immediate mode.
- Start the standby; it should see unreferenced pages.

> Hmm. Can we change the control file in released branches? (It should
> be possible to make the new server understand both old and new formats,
> but I think this is breaking new ground and it looks easy to introduce
> more bugs there.)

We definitely can't, even if the new value is added at the end of
DBState :(

A couple of wild ideas, not tested, again after waking up:
1) We could also abuse existing values by reusing the existing
DB_IN_CRASH_RECOVERY or DB_STARTUP. Still, that would not be completely
accurate, as the cluster may be open for business as a hot standby.
2) Invent a new special value for XLogRecPtr, normally impossible to
reach, which uses high bits.
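
To make idea 2) a bit more concrete, here is a minimal, untested sketch of
what a reserved XLogRecPtr value could look like. XLogRecPtr is a 64-bit
unsigned WAL position in PostgreSQL; everything else below (the macro name
RECPTR_SPECIAL_MARKER, the choice of all bits set, and both helper
functions) is hypothetical and only meant for illustration. It is not
existing PostgreSQL code, and it says nothing about which control file
field would carry the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PostgreSQL represents a WAL position as a 64-bit unsigned integer. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical marker: all bits set, including the high bits.  The
 * assumption is that no real WAL position ever reaches this value.
 */
#define RECPTR_SPECIAL_MARKER ((XLogRecPtr) UINT64_MAX)

/* Hypothetical helper: is this pointer the reserved marker? */
static bool
recptr_is_special_marker(XLogRecPtr ptr)
{
    return ptr == RECPTR_SPECIAL_MARKER;
}

/*
 * Hypothetical helper: compare a real WAL position against a stored
 * value that may be the marker.  The marker is treated as "no limit
 * recorded", so callers never compare real positions against it
 * numerically.
 */
static bool
recptr_has_reached(XLogRecPtr current, XLogRecPtr stored)
{
    /* A real position must never collide with the marker. */
    assert(!recptr_is_special_marker(current));

    if (recptr_is_special_marker(stored))
        return true;
    return current >= stored;
}
```

The point of the explicit check in recptr_has_reached() is that a plain
numeric comparison against an all-ones value would never succeed, so any
code path reading the stored value would have to learn about the marker.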
--
Michael
