Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Date: 2010-09-03 20:20:46
Message-ID: 4C81589E.8030105@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 03/09/10 19:38, Robert Haas wrote:
> On Fri, Sep 3, 2010 at 12:10 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>>> On Fri, Sep 3, 2010 at 10:07 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> [ shrug... ] I stated before that the Hot Standby patch is doing
>>>> utterly unsafe things in signal handlers. Simon rejected that.
>>>> I am waiting for irrefutable evidence to emerge from the field
>>>> (and am very confident that it will be forthcoming...) before
>>>> I argue with him further. Meanwhile, I'm not going to accept anything
>>>> unsafe in a core facility like this patch is going to be.
>>
>>> Oh. I thought you had ignored his objections and fixed it. Why are
>>> we releasing 9.0 with this problem again? Surely this is nuts.
>>
>> My original review of hot standby found about half a dozen things
>> I thought were broken:
>> http://archives.postgresql.org/pgsql-hackers/2010-05/msg00178.php
>> After a *very* long-drawn-out fight I fixed one of them
>> (max_standby_delay), largely still over Simon's objections. I don't
>> have the energy to repeat that another half-dozen times, so I'm going
>> to wait for the suspected problems to be proven by field experience.
>
> Bummer. Allow me to cast a vote for doing something about the fact
> that handle_standby_sig_alarm() thinks it can safely acquire an LWLock
> in a signal handler. I think we should be making our decisions on
> what to change in the code based on what is technically sound, rather
> than based on how much the author complains about changing it. Of
> course there may be cases where there is a legitimate difference of
> opinion concerning the best way forward, but I don't believe this is
> one of them.

Hmm, just to make the risk more concrete, here's one scenario that could
happen:

1. Startup process tries to acquire cleanup lock on a page. It's pinned,
so it has to wait, and calls ResolveRecoveryConflictWithBufferPin().
2. ResolveRecoveryConflictWithBufferPin enables the standby SIGALRM
handler by calling enable_standby_sig_alarm(), and calls
ProcWaitForSignal().

3. ProcWaitForSignal() calls semop() (assuming sysv semaphores here) to
wait for the process semaphore

4. Max standby delay is reached and SIGALRM fired. CheckStandbyTimeout()
is called in signal handler. CheckStandbyTimeout() calls
SendRecoveryConflictWithBufferPin(), which calls CancelDBBackends()

5. CancelDBBackends() tries to acquire ProcArrayLock in exclusive mode.
It's being held by another process, so we have to sleep

6. To sleep, LWLockAcquire calls PGSemaphoreLock, which calls semop() to
wait on on the process semaphore.

So we now have the same process nested twice inside a semop() call.
Looking at the Linux signal (7) man page, it is undefined what happens
if semop() is re-entered in a signal handler while another semop() call
is happening in main line of execution. Assuming it somehow works, which
semop() call is going to return when the semaphore is incremented?

Maybe that's ok, if I'm reading the deadlock checker code correctly, it
also calls semop() to increment the another process' semaphore, and the
deadlock checker can be invoked from a signal handler while in semop()
to wait on our process' semaphore. BTW, sem_post(), which is the Posix
function for incrementing a semaphore, is listed as a safe function to
call in a signal handler. But it's certainly fishy.

A safer approach would be to just PGSemaphoreUnlock() in the signal
handler, and do all the other processing outside it. You'd still call
semop() within semop(), but at least it would be closer to the semop()
within semop() we already do in the deadlock checker. And there would be
less randomness from timing and lock contention involved, making it
easier to test the behavior on various platforms.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-09-03 20:25:17 Re: Cost estimates for parameterized paths
Previous Message Tom Lane 2010-09-03 20:18:54 Re: returning multiple result sets from a stored procedure