Re: [patch] helps fe-connect.c handle -EINTR more gracefully

From: David Ford <david(at)blue-labs(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Date: 2001-10-27 02:08:05
Message-ID: 3BDA1705.9020207@blue-labs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
>
>I think you are missing the point. I am not saying that we shouldn't
>deal with EINTR; rather I am raising what I think is a legitimate
>question: *what* is the most appropriate response? My reading of
>HP's gloss suggests that we could treat EINTR the same as EINPROGRESS,
>ie, consider the connect() to have succeeded and move on to the
>wait-for-connection-complete-or-failure-using-select() phase.
>If that works I would prefer it to repeating the connect() call,
>primarily because it avoids any possibility of introducing an
>infinite loop.
>

You wouldn't get an infinite loop, you'd get Exx indicating the
operation was in progress. Yes, you could spin on a select() waiting
for the end result. What I normally do is this:

connect()

while(select()) {
switch () {
case EINTR:
break;
case EINPROGRESS:
nanosleep();
break;
case ETIMEDOUT:
default:
/* handle timeout and other error conditions nicely for the
user */
break;
}
}

With EINTR, it's fine to immediately start working again because your
code was interrupted from outside this scope. We don't know where in
connect() we were interrupted, blocking or non-blocking. With
EINPROGRESS I sleep for a while to be nice to the system. Here we know
that things are moving along like they should be and we are in a proper
sleepable period.

That isn't to imply that things will break if we sleep from EINTR. Only
that connect() exited due to an interruption, not due to planning.

>
>No, we don't. If you think that, then you haven't studied the code
>sufficiently to be submitting patches for it. What we actually do
>is exactly what is recommended by the manpage you're quoting at me.
>It's just split across multiple routines.
>

I traced several calls and they run through a few functions which end
up in pqFlush. These code paths haven't checked the socket to see if it
is ready for RW operation yet. pqFlush calls send() [ignoring SSL].

Only after a lot of code has been traversed is pqWait run in which the
socket is checked for RW and EINTR. My point that I was bringing up
with Peter was that it's much more nice on the system to wait for the
socket to become usable before going through all that code. In the
previous email I suggested that with a sufficiently fast timer event,
you'd never get back through the PQconnect* before being interrupted
again and that's why I advocate putting the EINTR as close to the
connect() as possible. Tying this together is why it is possible to
fail, a good amount of code is traversed before you get back to dealing
with the socket. Anywhere inbetween this signal events can happen again.

That's what provoked this original patch. Unless I shut off my timer or
changed my timer to happen in the long distant future, I would never
have a successful connection established.

David

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Dyas 2001-10-27 02:52:50 consistent naming of components
Previous Message Tom Lane 2001-10-27 01:43:32 Re: [patch] helps fe-connect.c handle -EINTR more gracefully