
Re: [patch] helps fe-connect.c handle -EINTR more gracefully

From: David Ford <david(at)blue-labs(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Date: 2001-10-27 00:15:30
Message-ID: 3BD9FCA2.90903@blue-labs.org
Lists: pgsql-hackers
>
>
>
>No, it should *not* look like that.  The fe-connect.c code is designed
>to move on as soon as it's convinced that the kernel has accepted the
>connection request.  We use a non-blocking connect() call and later
>wait for connection complete by probing the select() status.  Looping
>on the connect() itself would be a busy-wait, which would be antisocial.
>

The fe-connect.c code moves on regardless of the completion of the 
connect() if it has been interrupted.

To simplify, in a program without SIGALRM events, PQconnect* won't be 
interrupted.  The connect() call will complete properly.

In a program with SIGALRM events, the call is interrupted inside
connect().  If SA_RESTART were set for the signal handler, the program
would automatically jump right back into the connect() call.  However,
a handler installed with sigaction() does not get SA_RESTART unless the
flag is requested, which for SIGALRM means the system call is -not-
automatically restarted.  This means the programmer needs to check for
-1/errno=EINTR and jump back into connect() himself.  There isn't a
concern about busy-wait/antisocial code behavior: your program was in
the middle of connect() when it was interrupted, and you're simply
jumping back to where you left off.
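
As a rough illustration of the retry described above, here is a minimal
sketch.  The helper name connect_retry_eintr() is hypothetical and is not
taken from fe-connect.c, and treating EISCONN after an interrupted attempt
as success is an assumption about platform behavior, not something the
patch itself is known to do.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libpq): call connect() and jump back
 * into it whenever a signal such as SIGALRM interrupts the call.
 * Returns 0 on success, -1 with errno set on any other failure.
 */
static int
connect_retry_eintr(int fd, const struct sockaddr *addr, socklen_t len)
{
    int rc;
    int interrupted = 0;

    do
    {
        rc = connect(fd, addr, len);
        if (rc == -1 && errno == EINTR)
            interrupted = 1;    /* remember that we had to retry */
    } while (rc == -1 && errno == EINTR);

    /*
     * Assumption: on some systems the interrupted attempt keeps running
     * in the background, so the retry can report EISCONN.  Treat that as
     * success, but only if we really were interrupted.
     */
    if (rc == -1 && interrupted && errno == EISCONN)
        return 0;

    return rc;
}
```

The loop only spins when a signal actually arrives, so it is not a
busy-wait.  Results such as EINPROGRESS, EWOULDBLOCK or EALREADY are
passed back to the caller unchanged and still need the usual
non-blocking completion check.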

It doesn't matter if it is a blocking connect or non-blocking connect, 
handling EINTR must be done if SIGALRM events are employed.  A fast 
enough event timer with a non-blocking connect will also be susceptible 
to EINTR.

EINTR is distinctly different from EINPROGRESS.  If they were the same
then there would be a problem.  EINTR should be handled by jumping back
into the connect() call; it is re-entrant and designed for this.

Regardless, you don't wait for the connection to complete.  The code
following the connect() call returns failure for every -1 result from
connect() unless errno is EINPROGRESS or EWOULDBLOCK.  select() is -not-
used in fe-connect.c.  With the current code it is possible for the
connection to fail in non-blocking mode.  Reason: you call connect() in
non-blocking mode, break out of the section on EINPROGRESS, and continue,
assuming that the connection will be successful.

       EINPROGRESS
              The  socket is non-blocking and the connection can
              not be completed immediately.  It  is  possible  to
              select(2)  or  poll(2)  for completion by selecting
              the socket  for  writing.  After  select  indicates
              writability, use getsockopt(2) to read the SO_ERROR
              option at level  SOL_SOCKET  to  determine  whether
              connect  completed  successfully (SO_ERROR is zero)
              or unsuccessfully (SO_ERROR is  one  of  the  usual
              error  codes listed here, explaining the reason for
              the failure).

The socket is not checked any further after the connect().  The code 
should not continue on into the SSL handling until you're sure that the 
socket is ready for operation.

I am getting EINTR from a non-blocking connect because my event timer
happens to fire in the middle of the connect() call.  Just because you
set FIONBIO on the socket doesn't mean that connect() can't be
interrupted.

David


