Re: [patch] helps fe-connect.c handle -EINTR more gracefully

From: David Ford <david(at)blue-labs(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Date: 2001-10-26 23:51:48
Message-ID: 3BD9F714.80606@blue-labs.org
Lists: pgsql-hackers

"The **SA_RESTART** flag is always set by the underlying system in
POSIX mode so that interrupted system calls will fail with return value
of -1 and the *EINTR* error in /errno/ instead of getting restarted."
libpq's pqsignal.c doesn't turn off SA_RESTART for SIGALRM. Further,
pqsignal.c only handles SIGPIPE, and other parts of libpq already
handle EINTR properly.

The PQconnect* family does not handle EINTR; that is, it does not handle
the possible and perfectly legitimate interruption of a system call.
Globally trying to prevent system calls from being interrupted is a Bad
Thing. Having a timer event is common; having a timer event in a daemon
is often required. Timers allow for good housekeeping and playing nice
with the rest of the system.

Your "reasonable behaviour" in the case of EINTR means repeatable and
mysterious failure. There isn't a clean way to re-enter PQconnect*
while maintaining state after a signal interruption, and there is no
guarantee the function won't be interrupted again.

Basically if you have a timer event in your software and you use pgsql,
then the following will happen.

a) if the timer event always fires outside the PQconnect* call, your
code will function
b) if the timer event always fires during the PQconnect* call, your code
will never function
c) if your timer event sometimes fires during the PQconnect* call, your
code will sometimes function

There are no ifs, ands, or buts about it: if a timer fires inside
PQconnect* as it is now, there is no way to continue. With a suitably
long timer period, you can try the PQconnect* call again, and if the
connect succeeds before the timer fires again you're fine. If not, you
must keep retrying.

That said, there are two ways to go about it: a) handle it cleanly
inside PQconnect*, as it should be done, or b) have the programmer parse
the error string for "Interrupted system call" and re-enter PQconnect*.
a) is clean, short, and simple; b) wastes a lot of CPU attempting to
accomplish the task. a) is guaranteed to work and b) is not.
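
As a rough illustration of option a), here is one way a blocking connect()
could absorb EINTR. This is a sketch under assumptions, not the patch from
this thread and not libpq code; connect_retry_eintr and
demo_loopback_connect are hypothetical names. It relies on the POSIX rule
that an interrupted connect() is not aborted but completes asynchronously,
so instead of calling connect() again it waits for the socket to become
writable and then reads SO_ERROR.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -1 with errno set on failure. */
int
connect_retry_eintr(int sock, const struct sockaddr *addr, socklen_t addrlen)
{
	struct pollfd pfd;
	int			so_error = 0;
	socklen_t	len = sizeof(so_error);
	int			rc;

	if (connect(sock, addr, addrlen) == 0)
		return 0;
	if (errno != EINTR)
		return -1;				/* a real failure, report it */

	/* The attempt is still in progress; wait for it to finish. */
	pfd.fd = sock;
	pfd.events = POLLOUT;
	do
	{
		pfd.revents = 0;
		rc = poll(&pfd, 1, -1);
	} while (rc < 0 && errno == EINTR);
	if (rc < 0)
		return -1;

	/* SO_ERROR holds the final outcome of the connection attempt. */
	if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
		return -1;
	if (so_error != 0)
	{
		errno = so_error;
		return -1;
	}
	return 0;
}

/* Usage sketch: connect to a throwaway listener on the loopback interface. */
int
demo_loopback_connect(void)
{
	struct sockaddr_in addr;
	socklen_t	len = sizeof(addr);
	int			lsock, csock, rc;

	lsock = socket(AF_INET, SOCK_STREAM, 0);
	if (lsock < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;			/* let the kernel pick a free port */
	if (bind(lsock, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
		listen(lsock, 1) < 0 ||
		getsockname(lsock, (struct sockaddr *) &addr, &len) < 0)
	{
		close(lsock);
		return -1;
	}

	csock = socket(AF_INET, SOCK_STREAM, 0);
	if (csock < 0)
	{
		close(lsock);
		return -1;
	}
	rc = connect_retry_eintr(csock, (struct sockaddr *) &addr, sizeof(addr));
	close(csock);
	close(lsock);
	return rc;
}
```

The caller sees either success or a genuine connection error, no matter
how often the timer fires while the connection is being set up.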

David

Peter Eisentraut wrote:

David Ford writes:

>Libpq doesn't deal with system calls being interrupted in the slightest.
>None of the read/write or socket calls handle any errors. Even benign
>returns i.e. EINTR are treated as fatal errors and returned. Not to
>malign, but there is no reason not to continue on and handle EINTR.

Libpq certainly does deal with system calls being interrupted: It does
not allow them to be interrupted. Take a look into the file pqsignal.c to
see why.

If your alarm timer interrupts system calls then that's because you have
installed your signal handler to allow that. In my mind, a reasonable
behaviour in that case would be to let the PQconnect or equivalent fail
and provide the errno to the application.
