recovery_connections cannot start (was Re: master in standby mode croaks)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: recovery_connections cannot start (was Re: master in standby mode croaks)
Date: 2010-04-22 16:04:09
Message-ID: g2t603c8f071004220904gcd645664hccb27a90cca80c23@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 17, 2010 at 6:52 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Apr 17, 2010 at 6:41 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Sat, 2010-04-17 at 17:44 -0400, Robert Haas wrote:
>>
>>> > I will change the error message.
>>>
>>> I gave a good deal of thought to trying to figure out a cleaner
>>> solution to this problem than just changing the error message and
>>> failed.  So let's change the error message.  Of course I'm not quite
>>> sure what we should change it TO, given that the situation is the
>>> result of an interaction between three different GUCs and we have no
>>> way to distinguish which one(s) are the problem.
>>
>> "You need all three" covers it.
>
> Actually you need recovery_connections and either archive_mode=on or
> max_wal_senders>0, I think.

One way we could fix this is to use 2 bits rather than 1 for
XLogStandbyInfoMode. One bit could indicate that either
archive_mode=on or max_wal_senders>0, and the second bit could
indicate that recovery_connections=on. If the second bit is unset, we
could emit the existing complaint:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and
max_wal_senders=0 on the WAL source server
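
To make the two-bit idea concrete, here is a minimal standalone sketch, not actual backend code. The flag names and both function names are hypothetical; only the GUC names and the two message texts come from the proposal above. It also assumes that when both bits are unset, the recovery_connections complaint is reported first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names; the proposal only says "2 bits". */
#define STANDBY_INFO_WAL_SHIPPING   0x01    /* archive_mode=on or max_wal_senders>0 */
#define STANDBY_INFO_RECOVERY_CONNS 0x02    /* recovery_connections=on */

/* On the WAL source server: derive the flags byte from the three settings. */
static unsigned char
standby_info_flags(bool archive_mode, int max_wal_senders,
                   bool recovery_connections)
{
    unsigned char flags = 0;

    assert(max_wal_senders >= 0);

    if (archive_mode || max_wal_senders > 0)
        flags |= STANDBY_INFO_WAL_SHIPPING;
    if (recovery_connections)
        flags |= STANDBY_INFO_RECOVERY_CONNS;
    return flags;
}

/*
 * On the standby: return the complaint to emit, or NULL if both bits are
 * set.  If both bits are unset, the recovery_connections complaint wins
 * (an assumption of this sketch).
 */
static const char *
standby_info_complaint(unsigned char flags)
{
    if (!(flags & STANDBY_INFO_RECOVERY_CONNS))
        return "recovery connections cannot start because the "
               "recovery_connections parameter is disabled on the "
               "WAL source server";
    if (!(flags & STANDBY_INFO_WAL_SHIPPING))
        return "recovery connections cannot start because "
               "archive_mode=off and max_wal_senders=0 on the "
               "WAL source server";
    return NULL;
}
```

With this layout, each complaint names only the setting(s) actually at fault, which is the point of spending the extra bit.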

If we don't want to use two bits there, it's hard to really describe
all the possibilities in a reasonable number of characters. The only
thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL
source server
HINT: make sure recovery_connections=on and either archive_mode=on or
max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on
the standby, but presumably we could make that be the case if it's not
already.
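
For comparison, the single-bit variant with a message and a hint might look roughly like the standalone sketch below. The function name and the boolean argument are hypothetical; in the backend this would presumably go through ereport() with errmsg() and errhint() rather than snprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: with a single bit, the standby only knows whether
 * the source server's settings were acceptable, not which one was wrong,
 * so it can only print one generic message plus a hint.
 *
 * Returns 0 if there is nothing to complain about, otherwise the length
 * snprintf() reports for the formatted text.
 */
static int
format_standby_complaint(bool standby_info_ok, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (standby_info_ok)
        return 0;

    return snprintf(buf, buflen,
                    "recovery_connections cannot start due to incorrect "
                    "settings on the WAL source server\n"
                    "HINT: make sure recovery_connections=on and either "
                    "archive_mode=on or max_wal_senders>0\n");
}
```

The caller gets the same text no matter which setting is wrong, so the user has to check all three.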

I think the first way is better because it gives the user more
specific information about what they need to fix. Thinking about how
each case might happen, since the default for recovery_connections is
'on', it seems that recovery_connections=off will likely only be an
issue if the user has explicitly turned it off. The other case, where
archive_mode=off and max_wal_senders=0, will likely only occur if
someone takes a snapshot of the master without first setting up
archiving or streaming replication. Both of these will probably
happen relatively rarely, but since we're burning a whole byte for
XLogStandbyInfoMode (plus 3 more bytes of padding?), it seems like we
might as well snag one more bit for clarity.

Thoughts?

...Robert
