Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: David Ford <david+cert(at)blue-labs(dot)org>, pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly
Date: 2002-04-15 23:34:14
Message-ID: 200204152334.g3FNYEO05629@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Fix applied. Thanks.

---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> David, sorry you patch didn't make it into 7.2.X. That whole EINTR
> discussion was quite complicated so I am not surprised we missed it.
>
> The attached patch implements your ENITR test in cases that seems to
> need it. I have followed the method we used for ENITRY in fe-misc.c.
>
>
> ---------------------------------------------------------------------------
>
> David Ford wrote:
> > Last year we had a drawn out discussion about this and I created a patch
> > for it. I never noticed that the patch didn't go in until I installed
> > 7.2 the other day and realised that fe-connect.c never was fixed.
> >
> > Here is the patch again. It is against CVS 3/16/2002. This time I only
> > rewrote the connect procedure at line 912, I leave it up to the regular
> > hackers to copy it's functionality to the SSL procedure just below it.
> >
> > In summary, if a software writer implements timer events or other events
> > which generate a signal with a timing fast enough to occur while libpq
> > is inside connect(), then connect returns -EINTR. The code following
> > the connect call does not handle this and generates an error message.
> > The sum result is that the pg_connect() fails. If the timer or other
> > event is right on the window of the connect() completion time, the
> > pg_connect() may appear to work sporadically. If the event is too slow,
> > pg_connect() will appear to always work and if the event is too fast,
> > pg_connect() will always fail.
> >
> > David
> >
>
> > Index: src/interfaces/libpq/fe-connect.c
> > ===================================================================
> > RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v
> > retrieving revision 1.181
> > diff -u -r1.181 fe-connect.c
> > --- src/interfaces/libpq/fe-connect.c 2001/11/11 02:09:05 1.181
> > +++ src/interfaces/libpq/fe-connect.c 2002/03/16 05:17:47
> > @@ -909,29 +909,48 @@
> > * Thus, we have to make arrangements for all eventualities.
> > * ----------
> > */
> > - if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0)
> > - {
> > - if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0)
> > - {
> > - /*
> > - * This is fine - we're in non-blocking mode, and the
> > - * connection is in progress.
> > - */
> > - conn->status = CONNECTION_STARTED;
> > - }
> > - else
> > - {
> > - /* Something's gone wrong */
> > - connectFailureMessage(conn, SOCK_ERRNO);
> > - goto connect_errReturn;
> > + do {
> > + int e;
> > + e=connect(conn->sock, &conn->raddr.sa, conn->raddr_len)
> > +
> > + if(e < 0) {
> > + switch (e) {
> > + case EINTR:
> > + /*
> > + * Interrupted by a signal, keep trying. This handling is
> > + * required because the user may have turned on signals in
> > + * his program. Previously, libpq would erronously fail to
> > + * connect if the user's timer event fired and interrupted
> > + * this syscall. It is important that we don't try to sleep
> > + * here because this may cause havoc with the user program.
> > + */
> > + continue;
> > + break;
> > + case 0:
> > + case EINPROGRESS:
> > + case EWOULDBLOCK:
> > + /*
> > + * This is fine - we're in non-blocking mode, and the
> > + * connection is in progress.
> > + */
> > + conn->status = CONNECTION_STARTED;
> > + break;
> > + default:
> > + /* Something's gone wrong */
> > + connectFailureMessage(conn, SOCK_ERRNO);
> > + goto connect_errReturn;
> > + break;
> > + }
> > + } else {
> > + /* We're connected now */
> > + conn->status = CONNECTION_MADE;
> > }
> > - }
> > - else
> > - {
> > - /* We're connected already */
> > - conn->status = CONNECTION_MADE;
> > - }
> > +
> > + if(conn->status == CONNECTION_STARTED || conn->status == CONNECTION_MADE)
> > + break;
> >
> > + } while(1);
> > +
> > #ifdef USE_SSL
> > /* Attempt to negotiate SSL usage */
> > if (conn->allow_ssl_try)
>
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026

> Index: src/interfaces/libpq/fe-connect.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v
> retrieving revision 1.182
> diff -c -r1.182 fe-connect.c
> *** src/interfaces/libpq/fe-connect.c 2 Mar 2002 00:49:22 -0000 1.182
> --- src/interfaces/libpq/fe-connect.c 14 Apr 2002 04:40:24 -0000
> ***************
> *** 913,920 ****
> --- 913,925 ----
> * Thus, we have to make arrangements for all eventualities.
> * ----------
> */
> + retry1:
> if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0)
> {
> + if (SOCK_ERRNO == EINTR)
> + /* Interrupted system call - we'll just try again */
> + goto retry1;
> +
> if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0)
> {
> /*
> ***************
> *** 949,957 ****
> --- 954,967 ----
> SOCK_STRERROR(SOCK_ERRNO));
> goto connect_errReturn;
> }
> + retry2:
> /* Now receive the postmasters response */
> if (recv(conn->sock, &SSLok, 1, 0) != 1)
> {
> + if (SOCK_ERRNO == EINTR)
> + /* Interrupted system call - we'll just try again */
> + goto retry2;
> +
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("could not receive server response to SSL negotiation packet: %s\n"),
> SOCK_STRERROR(SOCK_ERRNO));
> ***************
> *** 2132,2139 ****
> --- 2142,2153 ----
> "PQrequestCancel() -- socket() failed: ");
> goto cancel_errReturn;
> }
> + retry3:
> if (connect(tmpsock, &conn->raddr.sa, conn->raddr_len) < 0)
> {
> + if (SOCK_ERRNO == EINTR)
> + /* Interrupted system call - we'll just try again */
> + goto retry3;
> strcpy(conn->errorMessage.data,
> "PQrequestCancel() -- connect() failed: ");
> goto cancel_errReturn;
> ***************
> *** 2150,2157 ****
> --- 2164,2175 ----
> crp.cp.backendPID = htonl(conn->be_pid);
> crp.cp.cancelAuthCode = htonl(conn->be_key);
>
> + retry4:
> if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
> {
> + if (SOCK_ERRNO == EINTR)
> + /* Interrupted system call - we'll just try again */
> + goto retry4;
> strcpy(conn->errorMessage.data,
> "PQrequestCancel() -- send() failed: ");
> goto cancel_errReturn;
> Index: src/interfaces/libpq/fe-misc.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-misc.c,v
> retrieving revision 1.68
> diff -c -r1.68 fe-misc.c
> *** src/interfaces/libpq/fe-misc.c 6 Mar 2002 06:10:42 -0000 1.68
> --- src/interfaces/libpq/fe-misc.c 14 Apr 2002 04:40:25 -0000
> ***************
> *** 361,367 ****
> if (!conn || conn->sock < 0)
> return -1;
>
> ! retry:
> FD_ZERO(&input_mask);
> FD_SET(conn->sock, &input_mask);
> timeout.tv_sec = 0;
> --- 361,367 ----
> if (!conn || conn->sock < 0)
> return -1;
>
> ! retry1:
> FD_ZERO(&input_mask);
> FD_SET(conn->sock, &input_mask);
> timeout.tv_sec = 0;
> ***************
> *** 371,377 ****
> {
> if (SOCK_ERRNO == EINTR)
> /* Interrupted system call - we'll just try again */
> ! goto retry;
>
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> --- 371,377 ----
> {
> if (SOCK_ERRNO == EINTR)
> /* Interrupted system call - we'll just try again */
> ! goto retry1;
>
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> ***************
> *** 395,401 ****
> if (!conn || conn->sock < 0)
> return -1;
>
> ! retry:
> FD_ZERO(&input_mask);
> FD_SET(conn->sock, &input_mask);
> timeout.tv_sec = 0;
> --- 395,401 ----
> if (!conn || conn->sock < 0)
> return -1;
>
> ! retry2:
> FD_ZERO(&input_mask);
> FD_SET(conn->sock, &input_mask);
> timeout.tv_sec = 0;
> ***************
> *** 405,411 ****
> {
> if (SOCK_ERRNO == EINTR)
> /* Interrupted system call - we'll just try again */
> ! goto retry;
>
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> --- 405,411 ----
> {
> if (SOCK_ERRNO == EINTR)
> /* Interrupted system call - we'll just try again */
> ! goto retry2;
>
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> ***************
> *** 478,484 ****
> }
>
> /* OK, try to read some data */
> ! tryAgain:
> #ifdef USE_SSL
> if (conn->ssl)
> nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd,
> --- 478,484 ----
> }
>
> /* OK, try to read some data */
> ! retry3:
> #ifdef USE_SSL
> if (conn->ssl)
> nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd,
> ***************
> *** 490,496 ****
> if (nread < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto tryAgain;
> /* Some systems return EAGAIN/EWOULDBLOCK for no data */
> #ifdef EAGAIN
> if (SOCK_ERRNO == EAGAIN)
> --- 490,496 ----
> if (nread < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto retry3;
> /* Some systems return EAGAIN/EWOULDBLOCK for no data */
> #ifdef EAGAIN
> if (SOCK_ERRNO == EAGAIN)
> ***************
> *** 531,537 ****
> (conn->inBufSize - conn->inEnd) >= 8192)
> {
> someread = 1;
> ! goto tryAgain;
> }
> return 1;
> }
> --- 531,537 ----
> (conn->inBufSize - conn->inEnd) >= 8192)
> {
> someread = 1;
> ! goto retry3;
> }
> return 1;
> }
> ***************
> *** 564,570 ****
> * Still not sure that it's EOF, because some data could have just
> * arrived.
> */
> ! tryAgain2:
> #ifdef USE_SSL
> if (conn->ssl)
> nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd,
> --- 564,570 ----
> * Still not sure that it's EOF, because some data could have just
> * arrived.
> */
> ! retry4:
> #ifdef USE_SSL
> if (conn->ssl)
> nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd,
> ***************
> *** 576,582 ****
> if (nread < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto tryAgain2;
> /* Some systems return EAGAIN/EWOULDBLOCK for no data */
> #ifdef EAGAIN
> if (SOCK_ERRNO == EAGAIN)
> --- 576,582 ----
> if (nread < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto retry4;
> /* Some systems return EAGAIN/EWOULDBLOCK for no data */
> #ifdef EAGAIN
> if (SOCK_ERRNO == EAGAIN)
> ***************
> *** 804,810 ****
>
> if (forRead || forWrite)
> {
> ! retry:
> FD_ZERO(&input_mask);
> FD_ZERO(&output_mask);
> FD_ZERO(&except_mask);
> --- 804,810 ----
>
> if (forRead || forWrite)
> {
> ! retry5:
> FD_ZERO(&input_mask);
> FD_ZERO(&output_mask);
> FD_ZERO(&except_mask);
> ***************
> *** 817,823 ****
> (struct timeval *) NULL) < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto retry;
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> SOCK_STRERROR(SOCK_ERRNO));
> --- 817,823 ----
> (struct timeval *) NULL) < 0)
> {
> if (SOCK_ERRNO == EINTR)
> ! goto retry5;
> printfPQExpBuffer(&conn->errorMessage,
> libpq_gettext("select() failed: %s\n"),
> SOCK_STRERROR(SOCK_ERRNO));

>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2002-04-15 23:36:04 Re: [GENERAL] Notify argument?
Previous Message Tom Lane 2002-04-15 22:36:23 Re: regression in CVS HEAD

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2002-04-15 23:41:56 Re: TOAST index naming consistency
Previous Message Bistey Csaba 2002-04-15 23:30:23 libpq, psql hungarian translation