Re: How should the primary behave when the sync standby goes away? Re: Sync Rep v17

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Daniel Farina <daniel(at)heroku(dot)com>
Subject: Re: How should the primary behave when the sync standby goes away? Re: Sync Rep v17
Date: 2011-03-07 18:15:59
Message-ID: AANLkTikvbQ+aJ2xJ_+hMTxUtV3F7XpD_Fv0bprMjvQi7@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 6, 2011 at 5:36 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Fri, 2011-03-04 at 16:57 +0900, Fujii Masao wrote:
>> On Wed, Mar 2, 2011 at 11:30 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> > On Wed, Mar 2, 2011 at 8:22 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> >> The WALSender deliberately does *not* wake waiting users if the standby
>> >> disconnects. Doing so would break the whole reason for having sync rep
>> >> in the first place. What we do is allow a potential standby to take over
>> >> the role of sync standby, if one is available. Or the failing standby
>> >> can reconnect and then release waiters.
>> >
>> > If there is a potential standby when the synchronous standby has gone, I agree
>> > that it's not a good idea to release the waiting backends soon. In this case,
>> > those backends should wait for the next synchronous standby.
>> >
>> > On the other hand, if there is no potential standby, I think that the waiting
>> > backends should not wait for the timeout and should wake up as soon as
>> > the synchronous standby has gone. Otherwise, those backends are suspended for
>> > a long time (i.e., until the timeout expires), which would reduce
>> > availability, I'm afraid.
>> >
>> > Keeping those backends waiting for the failed standby to reconnect is an
>> > idea. But this looks like the behavior for "allow_standalone_primary = off".
>> > If allow_standalone_primary = on, it looks more natural to make the
>> > primary work alone without waiting for the timeout.
>>
>> Also I think that the waiting backends should be released as soon as the
>> last synchronous standby switches to asynchronous mode. Since there is
>> no standby which is planning to reconnect, obviously they no longer need
>> to wait.
>
> I've not done this, but we could.
>
> It can't run in a WALSender, so this code would need to live in either
> WALWriter or BgWriter.

I would have thought that the last WALSender to switch to async would
have been responsible for doing this at that time. Why doesn't that
work?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
