Re: master in standby mode croaks

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: master in standby mode croaks
Date: 2010-04-10 13:02:28
Message-ID: n2j603c8f071004100602j8387e33al8102fe07549c89d1@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 2, 2010 at 5:36 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I can't duplicate this error based upon what you have said.

I fooled around with this some more and I think I know what's going
on. The error message I received was:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

This is generated when !checkPoint.XLogStandbyInfoMode. That, in
turn, is set on the master to the result of XLogStandbyInfoActive(),
which is defined as XLogRequestRecoveryConnections && XLogIsNeeded().
XLogIsNeeded() is defined as
XLogArchivingActive() || (max_wal_senders > 0), and
XLogArchivingActive() is defined as XLogArchiveMode. So when you
expand it all out, this error message gets triggered when the
following condition does not hold on the master:

XLogRequestRecoveryConnections && (XLogArchiveMode || (max_wal_senders > 0))

So this can fail in either of two ways: (1)
XLogRequestRecoveryConnections (aka recovery_connections) might be
false, which is the situation described in the error message, or (2)
XLogArchiveMode (archive_mode) might be false and at the same time
max_wal_senders might be zero. As it happens, the default
configuration of the system is recovery_connections = true,
archive_mode = false, max_wal_senders = 0, so an out-of-the-box
config fails for reason (2), which is not the one described in the
error message.
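
To make the expansion concrete, here is a minimal standalone sketch of
the condition in C. It is not the actual PostgreSQL source: the
MasterConfig struct and the function names are hypothetical stand-ins
for the GUC variables and macros named above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three master-side settings discussed above. */
typedef struct MasterConfig
{
    bool recovery_connections;  /* models XLogRequestRecoveryConnections */
    bool archive_mode;          /* models XLogArchiveMode */
    int  max_wal_senders;
} MasterConfig;

/* Models XLogIsNeeded(): archiving is on, or WAL senders are allowed. */
static bool
wal_is_needed(const MasterConfig *cfg)
{
    return cfg->archive_mode || (cfg->max_wal_senders > 0);
}

/*
 * Models XLogStandbyInfoActive(). When this is false on the master, the
 * standby reports the "recovery_connections parameter is disabled" error.
 */
static bool
standby_info_active(const MasterConfig *cfg)
{
    return cfg->recovery_connections && wal_is_needed(cfg);
}
```

With the defaults (recovery_connections = true, archive_mode = false,
max_wal_senders = 0), standby_info_active() returns false even though
recovery_connections is enabled.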

One possible approach here is to improve the error message, but it
seems to me that having the ability of Hot Standby to run on the slave
partially controlled by three different GUCs is awfully complicated.
I think the root of the problem here is that recovery_connections
controls one behavior on the primary (whether or not we WAL-log
certain information needed for HS) and a completely unrelated behavior
on the standby (whether or not we try to allow read-only backends into
the system). In 8.4 and prior, it was always the job of archive_mode
to decide whether WAL-logging was needed. Maybe we should go back to
that and make it an enum:

wal_mode = {standby | archive | off}
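
As a rough sketch of how a single enum could replace the
three-variable test on the master, something like the following might
work. The type, the constant names, and the assumption that "standby"
implies everything "archive" logs are all mine for illustration;
nothing here exists in the tree.

```c
#include <stdbool.h>

/*
 * Hypothetical enum for the proposed wal_mode setting. The ordering
 * assumes each level logs a superset of the previous one.
 */
typedef enum WalMode
{
    WAL_MODE_OFF,       /* no extra WAL-logging for archiving or standby */
    WAL_MODE_ARCHIVE,   /* enough WAL for archive recovery */
    WAL_MODE_STANDBY    /* archive-level WAL plus the info needed for HS */
} WalMode;

/* Would take the place of the XLogIsNeeded() test on the master. */
static bool
wal_mode_is_needed(WalMode mode)
{
    return mode >= WAL_MODE_ARCHIVE;
}

/* Would take the place of the XLogStandbyInfoActive() test on the master. */
static bool
wal_mode_standby_info_active(WalMode mode)
{
    return mode == WAL_MODE_STANDBY;
}
```

The point is that the master-side decision would depend on one setting
only, so the error message on the standby could name it directly.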

...Robert
