Re: BUG #5837: PQstatus() fails to report lost connection

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Murray S(dot) Kucherawy" <msk(at)cloudmark(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5837: PQstatus() fails to report lost connection
Date: 2011-01-23 03:44:04
Message-ID: 1950.1295754244@sss.pgh.pa.us
Lists: pgsql-bugs

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jan 13, 2011 at 8:36 PM, Murray S. Kucherawy <msk(at)cloudmark(dot)com> wrote:
>> 1) establish a connection to postgresql
>> 2) initiate a query, collect results, etc.; all normal
>> 3) while client is idle, restart the server
>> 4) initiate the very same query as before
>> 5) call PQgetResult(), returns non-NULL
>> 6) call PQresultStatus(), returns PGRES_FATAL_ERROR
>> 7) call PQstatus(), returns CONNECTION_OK

> I can reproduce this by hacking up src/test/examples/testlibpq to
> loop, but I'm not totally sure what's causing the behavior.

What did you do exactly? testlibpq.c just uses PQexec(), and AFAICS the
connection status does end up BAD if the backend is terminated before a
PQexec starts.

> I think
> the problem may be that libpq only reads enough from the connection to
> get the FATAL error. It doesn't keep reading to see whether there's
> an EOF afterward, and thus doesn't immediately realize that the
> connection has been closed.

I think the OP's mistake is to assume that the first PQgetResult ought
to mark the connection bad. As you say, libpq hasn't discovered the EOF
condition at the time it returns that error-message result, and we are
certainly not going to add another kernel call to every query cycle to
check for EOF.

The reason I don't see a problem when using PQexec is that PQexec will
internally do another PQgetResult, and it's the second one that will
fail and reset the connection status. In a non-connection-termination
situation, the second internal PQgetResult call consumes the
ReadyForQuery message, and at that point we fall out of PQexec (without
any extra kernel call). Here, though, there won't be a ReadyForQuery,
and it's the read() to try to collect one that discovers the loss of
connection.

In short: I don't think there's a bug here, just a failure to understand
the proper use of PQgetResult.

regards, tom lane
