Re: Restartable signals 'n all that

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Restartable signals 'n all that
Date: 2007-07-17 23:48:07
Message-ID: 200707172348.l6HNm7s06086@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Tom Lane wrote:
> While poking at the vacuum-launcher issue currently under discussion,
> I got annoyed again at the behavior we've known for a long while that
> on some platforms pg_usleep() won't be interrupted by signals. (In
> particular I see this on HPUX, though not on Linux or Darwin. Anyone
> know if it happens on any BSDen?) I noticed that with the launcher set
> to sleep at most one second between checks for signals, it seemed to
> *always* take the full second before shutting down, which seemed awfully
> unlucky.
>
> Some more testing and man-page-reading revealed the full truth of what's
> going on. The Single Unix Spec's select(2) page says under ERRORS
>
> [EINTR]
> The select() function was interrupted before any of the selected events
> occurred and before the timeout interval expired. If SA_RESTART has been
> set for the interrupting signal, it is implementation-dependent whether
> select() restarts or returns with [EINTR].
>
> Since pqsignal() sets SA_RESTART for all trapped signals except SIGALRM,
> that means we are exposing ourselves to the implementation dependency.
> What I furthermore realized while tracing is that "restart" means
> "start counting down the full timeout interval over again". Thus, if
> we have told select() to time out after 1 second, and SIGINT arrives
> after 0.9 second, we will wait a full second more before responding.
>
> Bad as that is, it gets worse rapidly: each new signal arrival restarts
> the cycle. So a continuous flow of signals at a spacing of less than
> 1 second would prevent the delay from *ever* terminating.
>
> This may be why some kernels reduce the timeout value before returning,
> so that a "restart" behavior in userland behaves sanely. But that's
> not what's happening for me :-(.
>
> To me, this calls into question whether we should try to avoid using
> SA_RESTART at all. The reason for doing it of course is to avoid
> unexpected syscall EINTR failures as well as short read/short write
> behaviors during disk I/O. However, if that's the plan then what the
> heck is pqsignal() doing giving an exception for SIGALRM? As soon as
> you have even one non-restartable trapped signal, it seems you need
> to handle EINTR everywhere.
>
> I looked into the CVS history and found that we inherited the SIGALRM
> exception from Berkeley (in fact it's in the v4r2 sources from 1994).
> Back then the system's usage of SIGALRM was pretty darn narrow --- it
> was used only to trigger the deadlock checker, which means it applied
> only while waiting for a lock, and the range of code in which the
> interrupt could occur was just a few lines. Now that we use SIGALRM for
> statement_timeout, the interrupt can potentially happen almost anywhere
> in the backend code.
>
> So we've got two problems: SA_RESTART is preventing some EINTRs from
> happening when we'd like, and yet it seems we are at risk of unwanted
> EINTRs anyway.
>
> The only really clean solution I can see is to stop using SA_RESTART
> and try to make all our syscalls EINTR-proof. But the probability
> of bugs-of-omission seems just about 100%, especially in third party
> backend add-ons that we don't get to review the code for.
>
> If we do nothing, anyone using statement_timeout is at risk. The
> risk is somewhat ameliorated by the fact that occurrence of the
> interrupt means transaction cancellation anyway, so an unexpected
> error of some other type isn't really a fatal problem. But it's
> still a bit nervous-making. I don't currently see a way to get
> corrupted data from an EINTR (bufmgr is fairly robust about write
> failures, for instance) but ...
>
> If we decide to live with that, and fix any reported problems, then
> one thing we could do to ameliorate the sleep problem is to turn
> off SA_RESTART for all activity-cancelling interrupts, in particular
> SIGINT/SIGTERM/SIGQUIT. This wouldn't make it safe for bgwriter
> and friends to go back to long sleep intervals, because they are
> watching for other interrupts too that don't represent reasons to
> cancel transactions. But it would at least solve the problem of
> slow response to shutdown requests.
>
> Comments? I sure hope someone has a better idea.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Brendan Jurd 2007-07-18 00:00:44 Re: Invalid to_date patterns (was: [PATCHES] [GENERAL] ISO week dates)
Previous Message Bruce Momjian 2007-07-17 23:23:16 Re: AutoVacuum Behaviour Question