Re: pending patch: Re: HS/SR and smart shutdown

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pending patch: Re: HS/SR and smart shutdown
Date: 2010-04-01 10:48:12
Message-ID: l2m603c8f071004010348mf259f3e5yc7510de075f0a929@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 1, 2010 at 4:42 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Apr 1, 2010 at 12:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> > From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>>>> > "backends might be waiting for the WAL record that conflicts with their
>>>> > queries to be replayed". Recovery sometimes waits for backends, but
>>>> > backends never wait for recovery.
>>>>
>>>> Really? As Heikki explained before, backends might wait for the lock
>>>> taken by the startup process.
>>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>>>
>>> Backends wait for locks, yes, but they could be waiting for user locks
>>> also. That is not "waiting for the WAL record", that concept does not
>>> exist.
>>
>> Hmm... this is a good point, on two levels.  First, the comment is not
>> as well-phrased as it could be.  Second, I wonder why we can't kill
>> the startup process and WAL receiver right away, and then wait for the
>> backends to die off afterwards.
>
> I tested whether killing the startup process and walreceiver releases
> the lock which the backends are waiting for. Unfortunately it doesn't,
> and the backends got stuck on my box. Is it a bug that the startup
> process shuts down without releasing the lock?

I think that what this shows is that the original design of Hot
Standby didn't contemplate ever having Hot Standby up without the
startup process running. In retrospect, maybe we want to allow that,
because a smart shutdown would be more likely to complete in a timely
fashion if we stopped replication first and then waited for the
backends to die rather than waiting for the backends to die first and
then stopping replication. That's because, for as long as replication
continues, it may take new locks as well as release old ones, to say
nothing of using other system resources like CPU and I/O bandwidth.
But, for 9.0, I'm not sure we have any real choice, unless making the
startup process release locks when it goes away is a very simple
change. Assuming that's not the case, I think we should apply this
patch with some updates to the comments, document how it works and
that it may change in a future release, and add a TODO for 9.1.
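
To make the two orderings concrete, here is a rough toy model in Python.
It is only a sketch: every name in it (Standby, backends_first,
replication_first, release_locks_on_startup_exit) is hypothetical and
nothing here corresponds to actual postmaster code. It also assumes that
replay, if left running, eventually releases the lock a backend is
blocked on, which need not happen quickly in practice.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    # Toy model only: none of these names come from the PostgreSQL source.
    release_locks_on_startup_exit: bool
    startup_running: bool = True
    # Locks currently held by the (hypothetical) startup process.
    startup_locks: set = field(default_factory=lambda: {"rel_a"})
    # Backend name -> lock it is waiting for (None means not blocked).
    backends: dict = field(default_factory=lambda: {"b1": "rel_a", "b2": None})

    def stop_replication(self):
        """Kill the startup process; locks go away only if the flag is set."""
        self.startup_running = False
        if self.release_locks_on_startup_exit:
            self.startup_locks.clear()

    def replay_until_locks_released(self):
        """Assumption: continued replay eventually releases the old locks."""
        if self.startup_running:
            self.startup_locks.clear()

    def wait_for_backends(self):
        """Let unblocked backends exit; return the names that stay stuck."""
        stuck = sorted(
            name for name, lock in self.backends.items()
            if lock in self.startup_locks
        )
        self.backends = {name: self.backends[name] for name in stuck}
        return stuck


def backends_first(standby):
    """Ordering in the pending patch: wait for backends, then stop replication."""
    standby.replay_until_locks_released()
    stuck = standby.wait_for_backends()
    standby.stop_replication()
    return stuck


def replication_first(standby):
    """Alternative ordering: stop replication, then wait for backends."""
    standby.stop_replication()
    return standby.wait_for_backends()
```

In this model, replication_first only finishes cleanly when
release_locks_on_startup_exit is true, which matches what Fujii saw:
without that change, a backend blocked on a startup-process lock never
gets to exit.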

Thoughts?

...Robert
