From: | "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | "Simon Riggs" <simon(at)2ndquadrant(dot)com> |
Cc: | "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCHES] Infrastructure changes for recovery (v8) |
Date: | 2008-11-18 01:53:13 |
Message-ID: | 3f0b79eb0811171753u76c2b78du8e77acfc3b5280c0@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
On Tue, Nov 18, 2008 at 12:39 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 2008-11-17 at 16:18 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>> > @@ -3845,6 +3850,52 @@ sigusr1_handler(SIGNAL_ARGS)
>> >
>> > PG_SETMASK(&BlockSig);
>> >
>> > + if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_START))
>> > + {
>> > + Assert(pmState == PM_STARTUP);
>> > +
>> > + /*
>> > + * Go to shutdown mode if a shutdown request was pending.
>> > + */
>> > + if (Shutdown > NoShutdown)
>> > + {
>> > + pmState = PM_WAIT_BACKENDS;
>> > + /* PostmasterStateMachine logic does the rest */
>> > + }
>> > + else
>> > + {
>> > + /*
>> > + * Startup process has entered recovery
>> > + */
>> > + pmState = PM_RECOVERY;
>>
>>
>> Hmm, I smell a race condition here:
>>
>> 1. Startup process goes into consistent state, and signals postmaster
>> 2. Startup process finishes WAL replay and dies
>> 3. Postmaster wakes up in reaper(), noting that the startup process
>> dies, and goes into PM_RUN mode.
>> 4. The above signal handler for postmaster is run, causing an assertion
>> failure, or putting postmaster back into PM_RECOVERY mode if assertions
>> are disabled.
>>
>> Highly unlikely in practice, given how much code needs to run in the
>> startup process between signaling the postmaster and exiting, but it
>> seems theoretically possible. Do we care, and if we do, how can we fix it?
>
> Might be possible - it does depend on the sequence of actions its true.
> Agree not likely to happen, except as the result of another bug.
>
> I'll change it to a test for
>
> if (pmState == PM_STARTUP)
> pmState = PM_RECOVERY;
Likewise, should we also change the assertion against the pid of the
background process (BgWriterPID, PgStatPID)?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Emmanuel Cecchet | 2008-11-18 02:00:03 | Re: Transactions and temp tables |
Previous Message | Paul Schlie | 2008-11-18 01:00:35 | Re: Block-level CRC checks |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2008-11-18 13:15:41 | Re: Solaris ident authentication using unix domain sockets |
Previous Message | Simon Riggs | 2008-11-17 15:39:42 | Re: [PATCHES] Infrastructure changes for recovery (v8) |