Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Date: 2010-09-03 22:24:18
Message-ID: 974.1283552658@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Color me confused; I may need to backpedal rapidly here. I had
> thought that what Tom was complaining about was the fact that the
> signal handler was taking LWLocks, which I would have thought to be
> totally unsafe.

Well, it's unsafe if the signal could interrupt mainline code that is
trying to take an LWLock or already holds the same LWLock --- or, if you
consider deadlock risks against other processes, already holding other
LWLocks might be problematic too. And trying to use facilities like
elog or palloc is also pretty unsafe if the mainline is too.

The reason the deadlock checker is okay is that there is only a *very*
narrow range of mainline code that it could possibly be interrupting,
basically nothing but the PGSemaphoreLock() call in ProcSleep.
(There's a lot of other code inside the loop there, but notice that it's
protected by tests that ensure that it won't run unless the deadlock
checker already did.)

One salient thing about ProcSleep is that you shouldn't call it while
holding any LWLocks, and another is that if you're sleeping while
holding regular heavyweight locks, the deadlock checker is exactly what
will get you out of trouble if there's a deadlock.

Now the HS case likewise appears to be set up so that the signal can
only directly interrupt ProcWaitForSignal, so I think the core issue is
whether any deadlock situations are possible. Given that this gets
called from a low-level place like LockBufferForCleanup, I don't feel
too comfortable about that. I certainly haven't seen any analysis or
documentation of what locks can safely be held at that point.
The deadlock checker only tries to take the LockMgr LWLocks, so
extrapolating from whether it is safe to whether touching the
ProcArrayLock is safe seems entirely unfounded.

It might be worth pointing out here that LockBufferForCleanup is already
known to be a risk factor for undetected deadlocks, even without HS in
the picture, because of the possibility of deadlocks involving a chain
of both heavyweight locks and LWLocks. Whether HS makes it materially
worse may be something that we need field experience to determine.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2010-09-03 22:26:06 Re: Windows Tools
Previous Message Tom Lane 2010-09-03 21:53:53 Re: Cost estimates for parameterized paths