Re: [PATCHES] Infrastructure changes for recovery (v8)

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Infrastructure changes for recovery (v8)
Date: 2008-11-18 01:53:13
Message-ID: 3f0b79eb0811171753u76c2b78du8e77acfc3b5280c0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Tue, Nov 18, 2008 at 12:39 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 2008-11-17 at 16:18 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>> > @@ -3845,6 +3850,52 @@ sigusr1_handler(SIGNAL_ARGS)
>> >
>> > PG_SETMASK(&BlockSig);
>> >
>> > + if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_START))
>> > + {
>> > + Assert(pmState == PM_STARTUP);
>> > +
>> > + /*
>> > + * Go to shutdown mode if a shutdown request was pending.
>> > + */
>> > + if (Shutdown > NoShutdown)
>> > + {
>> > + pmState = PM_WAIT_BACKENDS;
>> > + /* PostmasterStateMachine logic does the rest */
>> > + }
>> > + else
>> > + {
>> > + /*
>> > + * Startup process has entered recovery
>> > + */
>> > + pmState = PM_RECOVERY;
>>
>>
>> Hmm, I smell a race condition here:
>>
>> 1. Startup process goes into consistent state, and signals postmaster
>> 2. Startup process finishes WAL replay and dies
>> 3. Postmaster wakes up in reaper(), noting that the startup process
>> dies, and goes into PM_RUN mode.
>> 4. The above signal handler for postmaster is run, causing an assertion
>> failure, or putting postmaster back into PM_RECOVERY mode if assertions
>> are disabled.
>>
>> Highly unlikely in practice, given how much code needs to run in the
>> startup process between signaling the postmaster and exiting, but it
>> seems theoretically possible. Do we care, and if we do, how can we fix it?
>
> Might be possible - it does depend on the sequence of actions its true.
> Agree not likely to happen, except as the result of another bug.
>
> I'll change it to a test for
>
> if (pmState == PM_STARTUP)
> pmState = PM_RECOVERY;

Likewise, should we also change the assertion against the pid of the
background process (BgWriterPID, PgStatPID)?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Emmanuel Cecchet 2008-11-18 02:00:03 Re: Transactions and temp tables
Previous Message Paul Schlie 2008-11-18 01:00:35 Re: Block-level CRC checks

Browse pgsql-patches by date

  From Date Subject
Next Message Peter Eisentraut 2008-11-18 13:15:41 Re: Solaris ident authentication using unix domain sockets
Previous Message Simon Riggs 2008-11-17 15:39:42 Re: [PATCHES] Infrastructure changes for recovery (v8)