An unfortunate logging behavior when (mis)configuring recovery.conf

From: Daniel Farina <drfarina(at)acm(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: An unfortunate logging behavior when (mis)configuring recovery.conf
Date: 2010-10-27 23:18:21
Message-ID: AANLkTikbddK-AtEZ6=Wy2rU6U9iEEop9jdqKKf+pf-GM@mail.gmail.com
Lists: pgsql-hackers

Hello list,

I just encountered an interesting, undesirable behavior in Postgres
9.0's error reporting when dealing with a (trivially) malformed
recovery.conf, as might be the case when setting up hot standby. In
this case, there were some missing fields, and they were checked as
they are supposed to be in xlog.c:readRecoveryCommandFile, resulting
in an ereport(FATAL, ...).

The problem appears to be that when starting postmaster and entering
recovery there is a good chance, on at least some machines, that the
message accompanying the FATAL ereport does not get written to the
log, seemingly because the signal notifying postmaster of a startup
child's mental breakdown gets processed first, hitting the following
code block:

    /*
     * Unexpected exit of startup process (including FATAL exit)
     * during PM_STARTUP is treated as catastrophic. There are no
     * other processes running yet, so we can just exit.
     */
    if (pmState == PM_STARTUP && !EXIT_STATUS_0(exitstatus))
    {
        LogChildExit(LOG, _("startup process"),
                     pid, exitstatus);
        ereport(LOG,
                (errmsg("aborting startup due to startup process failure")));
        ExitPostmaster(1);
    }

As a result, "aborting startup due to startup process failure" is seen
in the log, but not the messages raised in
xlog.c:readRecoveryCommandFile that triggered the failure.
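
To make the suspected ordering concrete, here is a small standalone
sketch. It is not PostgreSQL code and not a confirmed diagnosis: the
helper name run_failing_child, the pipe standing in for the log
destination, and the placeholder message text are all assumptions made
for illustration. A child writes its fatal message and exits with
status 1; the parent learns of the exit first and only then drains the
pipe. A parent that exited immediately after the wait, without
draining, would never emit the child's message.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 * Forks a child that writes a placeholder fatal message to a pipe and
 * exits with status 1.  The parent reaps the child first, then drains
 * the pipe into buf.  Returns the child's exit status, or -1 on error.
 */
int
run_failing_child(char *buf, size_t buflen)
{
    int     fds[2];
    pid_t   pid;
    int     status;
    ssize_t n;
    size_t  used = 0;

    if (buflen == 0 || pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: stand-in for a startup process hitting a fatal error. */
        const char *msg = "FATAL: placeholder message from the child\n";

        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(2);
        _exit(1);
    }

    /* Parent: close the write end so read() sees EOF once drained. */
    close(fds[1]);

    /* The exit is noticed here, while the message is still unread. */
    if (waitpid(pid, &status, 0) < 0)
    {
        close(fds[0]);
        return -1;
    }

    /* Drain whatever the child wrote before it died. */
    while (used < buflen - 1 &&
           (n = read(fds[0], buf + used, buflen - 1 - used)) > 0)
        used += (size_t) n;
    buf[used] = '\0';
    close(fds[0]);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_failing_child(buf, sizeof buf) from a test program and
printing buf shows the child's message only because the parent drains
the pipe after the wait; dropping the read loop models the symptom
described above.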

To get around this problem, I ran postgres with --single, and then
everything flushed as anticipated and the misconfiguration was easy to
pick out.

The machine was an EC2 instance.

Credit to Heroku and Jason Dusek for taking the time to communicate
this problem and let me mess with it for a while.

fdr
