Re: standalone backend PANICs during recovery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bernd Helmle <mailings(at)oopsware(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standalone backend PANICs during recovery
Date: 2016-08-20 16:41:48
Message-ID: 2086.1471711308@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> In short, I don't think control should have been here at all. My proposal
> for a fix is to force EnableHotStandby to remain false in a standalone
> backend.

I tried to reproduce Bernd's problem by starting a standalone backend in
a data directory that was configured as a hot standby slave, and quickly
found that there are much bigger issues than this. Early in the startup
sequence, the backend tries to wait for WAL to arrive, which in HEAD uses

    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
              5000L);

which immediately elog(FATAL)s because a standalone backend has no parent
postmaster and so postmaster_alive_fds[] isn't set. But if it didn't do
that, it'd wait forever because of course there is no active WAL receiver
process that would ever provide more WAL.

The only way that you'd ever get to a command prompt is if somebody made a
promotion trigger file, which would cause the startup code to promote the
cluster into master status, which does not really seem like something that
would be a good idea in Bernd's proposed use case of "investigating a
problem".

Alternatively, if we were to force standby_mode off in a standalone
backend, it would come to the command prompt right away but again it would
have effectively promoted the cluster to master. That is certainly not
something we should ever do automatically.

So at this point I'm pretty baffled as to what the actual use-case is
here. I am tempted to say that a standalone backend should refuse to
start at all if a recovery.conf file is present. If we do want to
allow the case, we need some careful thought about what it should do.
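
To make the "refuse to start" idea concrete, here is a minimal sketch of
such a check. It is not a patch and does not use PostgreSQL's internal
APIs: the function name standalone_must_refuse() and its two parameters
are hypothetical, chosen only for illustration, and the only detail taken
from the discussion above is that the file in question is recovery.conf
in the data directory.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL source code.
 *
 * Returns true if a standalone backend should refuse to start because the
 * data directory contains a recovery.conf file. A backend running under a
 * postmaster is never refused by this check.
 */
static bool
standalone_must_refuse(bool is_under_postmaster, const char *datadir)
{
    char path[1024];

    /* Normal postmaster child: recovery is handled the usual way. */
    if (is_under_postmaster)
        return false;

    /* Standalone backend: refuse if recovery.conf exists in datadir. */
    snprintf(path, sizeof(path), "%s/recovery.conf", datadir);
    return access(path, F_OK) == 0;
}
```

A real implementation would presumably report the refusal through the
server's normal error reporting rather than return a flag; the sketch
only shows the decision itself.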

regards, tom lane
