Re: BUG #5837: PQstatus() fails to report lost connection

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Murray S(dot) Kucherawy" <msk(at)cloudmark(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5837: PQstatus() fails to report lost connection
Date: 2011-01-24 04:58:44
Message-ID: AANLkTi=rSATsN+3UD5feWm3ntsB=6jqAFBZWdTZ7ZfaO@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sun, Jan 23, 2011 at 10:59 PM, Murray S. Kucherawy <msk(at)cloudmark(dot)com> wrote:
> As for the reply above, I disagree.  PQstatus(), as documented, doesn't say anything about certain conditions in which it won't report that the connection is dead, when it actually is, once the connection was already established and working.

I understand that gut feeling, but I think if you think about it a
little, you may realize that at some level it's an unreasonable
expectation. For example, suppose you were to connect to the server
(successfully), unplug the network cable, and then call PQstatus().
There is literally no way for PQstatus() to know whether that
connection is still there. It's just returning some internal flag out
of the object. And even if we wanted it to do something else, what?
Send a keepalive packet of some sort and wait for a response? Given
default TCP parameter settings, that could take many minutes to give
an answer - which is probably not what people want when they call
PQstatus() - and it would introduce a lot of otherwise unnecessary
network traffic for people who want to query the internal state of the
PGconn, not ping the server.

Now, in this case, things are a bit less clear, because it's not as if
the remote side has given us no indication that the connection is
about to get closed. This doesn't strike me as awesome protocol
design, but 14 years on we're probably stuck with it. I think if you
want to have some special handling when an administrative shutdown
happens on the server, you should use PQresultErrorField() to check
PG_DIAG_SQLSTATE; or if you want FATAL and PANIC conditions to be
handled specially you can check PG_DIAG_SEVERITY.

> Moreover, the description of PQgetResult() doesn't say or illustrate anything about proper use of it in this context, so how would a reader know he/she got it wrong?  The documentation I can find online of PQgetResult() doesn't enumerate the conditions where PQstatus() gives a false indication of whether or not a reconnect is required, nor does any part of the documentation I could find state that PGRES_FATAL_ERROR always implies the connection is no longer usable and must be re-established; "FATAL" could be referring to the transaction/request, not the connection.
>
> So, if this isn't a bug, then I think the documentation needs a bit of work in this area.

The description of PGRES_FATAL_ERROR in the documentation does pretty
much suck. I believe you'll get that if the backend returns either an
ERROR (in which case you need to retry the transaction) or a FATAL (in
which case you need to reset the connection), but that's not at all
clear either from the documentation or the naming of the constant
(which, alas, is hard to change at this point for
backward-compatibility reasons).

The description of PQstatus() looks correct to me. It says that "a
communications failure might result in the status changing to
CONNECTION_BAD prematurely". In the scenario you describe, no
communications failure has occurred. The server sent back an error
message, and closed the connection (though libpq hasn't noticed yet)
but there's no communications error anywhere until the client tries to
send a query over a connection that doesn't exist any more. Maybe we
could flesh that description out a bit to make it more clear that this
is really only going to tell you if TCP/IP has explicitly blown up in
your face, and not any other reason why things might not be working,
although I think there are hints of that there already.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Xiaobo Gu 2011-01-24 05:17:21 Re: [HACKERS] Is there a way to build PostgreSQL client libraries with MinGW
Previous Message Robert Haas 2011-01-24 04:26:49 Re: BUG #5837: PQstatus() fails to report lost connection