Re: Stefan's bug (was: max_standby_delay considered harmful)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Stefan's bug (was: max_standby_delay considered harmful)
Date: 2010-05-13 02:41:53
Message-ID: AANLkTimSVXSqhAT8J81IZX3eD0tl8PffbROjuFMot-Ks@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 12, 2010 at 10:36 PM, Alvaro Herrera
<alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Excerpts from Robert Haas's message of Wed May 12 20:48:41 -0400 2010:
>> On Wed, May 12, 2010 at 3:55 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> > I am wondering if we are not correctly handling the case where we get
>> > a shutdown request while we are still in the PM_STARTUP state.  It
>> > looks like we might go ahead and switch to PM_RECOVERY and then
>> > PM_RECOVERY_CONSISTENT without noticing the shutdown.  There is some
>> > logic to handle the shutdown when the startup process exits, but if
>> > the startup process never exits it looks like we might get stuck.
>>
>> I can reproduce the behavior Stefan is seeing consistently using the
>> attached patch.
>>
>> W1: postgres -D ~/pgslave
>> W2: pg_ctl -D ~/pgslave stop
>
> If there's anything to learn from this patch, it is that sleep is
> uninterruptible on some platforms.  This is why sleeps elsewhere are
> broken down in loops, sleeping in small increments and checking
> interrupts each time.  Maybe some of the sleeps in the new HS code need
> to be handled this way?

I don't think the problem is that the sleep is uninterruptible. I
think the problem is that a smart shutdown request received while in
the PM_STARTUP state is not acted upon until we enter the PM_RUN
state. That is, there's a race condition between the SIGUSR1 that the
startup process sends to the postmaster to signal that recovery has
begun and the SIGTERM being sent by pg_ctl.
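
To make the suspected race concrete, here is a toy model in C. It is
only a sketch of the hypothesis above, not the postmaster's actual code:
the Toy* types, the toy_* functions, and the single PM_SHUTTING_DOWN
state are all invented for illustration, and the rule "SIGTERM is only
acted on once we have left PM_STARTUP" is an assumption taken from the
paragraph above rather than from the source tree.

```c
#include <stdbool.h>

/* Toy stand-ins for a few postmaster states; NOT the real enum. */
typedef enum {
    PM_STARTUP,
    PM_RECOVERY,
    PM_SHUTTING_DOWN    /* hypothetical: collapses all shutdown states */
} ToyPMState;

typedef struct {
    ToyPMState state;
    bool       shutdown_requested;  /* a smart shutdown was asked for */
} ToyPostmaster;

/* SIGTERM from pg_ctl: remember the request, but (per the hypothesis)
 * only act on it if we have already left PM_STARTUP. */
static void toy_on_sigterm(ToyPostmaster *pm)
{
    pm->shutdown_requested = true;
    if (pm->state != PM_STARTUP)
        pm->state = PM_SHUTTING_DOWN;
}

/* SIGUSR1 from the startup process: recovery has begun.  Nothing here
 * looks at shutdown_requested, which is the suspected gap. */
static void toy_on_recovery_started(ToyPostmaster *pm)
{
    if (pm->state == PM_STARTUP)
        pm->state = PM_RECOVERY;
}

/* True if a shutdown was requested but never acted upon. */
static bool toy_is_stuck(const ToyPostmaster *pm)
{
    return pm->shutdown_requested && pm->state != PM_SHUTTING_DOWN;
}

/* Deliver the two signals in the given order; report whether we got stuck. */
static bool toy_race(bool sigterm_first)
{
    ToyPostmaster pm = { PM_STARTUP, false };

    if (sigterm_first) {
        toy_on_sigterm(&pm);
        toy_on_recovery_started(&pm);
    } else {
        toy_on_recovery_started(&pm);
        toy_on_sigterm(&pm);
    }
    return toy_is_stuck(&pm);
}
```

In this model toy_race(true) reports a stuck postmaster, because the
SIGTERM lands while we are still in PM_STARTUP and the later transition
to PM_RECOVERY never revisits the pending request. toy_race(false) does
not get stuck, since the SIGTERM arrives after the state change.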

However, since I haven't succeeded in producing a fix yet, take that
with a grain of salt...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
