Re: libpq and connection failures

From: jtv(at)xs4all(dot)nl
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: jtv(at)xs4all(dot)nl, pgsql-interfaces(at)postgresql(dot)org
Subject: Re: libpq and connection failures
Date: 2005-07-06 07:39:12
Message-ID: 24362.202.47.227.25.1120635552.squirrel@202.47.227.25
Lists: pgsql-interfaces

Tom Lane wrote:
> I think it's probably better to have the default assumption be
> "connection possibly recoverable" than have it be "summarily kill
> connection at first hint of trouble". The latter seems less robust
> not more so.

Not after the connection failure has made its way into a PGresult, surely?
Doesn't seem consistent with the design choice of aborting transactions
on error, for starters. Are you saying that a session is still in usable
shape when you have no way of establishing whether the last command
succeeded?

If you're in an explicit transaction when this happens, it's in an unknown
state[*] so you have to abort anyway. All the client hears is "there's
been an error of some sort, but the connection may or may not be fine,
thank you." You don't necessarily know what level of transaction nesting
you're in though, so you may have to fire off aborts until you're pretty
sure you're out of all of them. Frankly I'd rather call PQreset() just to
save myself the trouble.

[*] Yes, there are cases where the transaction is left in a reliable
state. Such as on a read-only query, which the application could retry
(I'll assume it cares about the results or it wouldn't have queried) at
the cost of greater code complexity. To me this is one of the cases where
simplicity and clarity matter a damn sight more than optimizing out the
reconnect on the off chance that the application knows how to handle the
situation despite not receiving even the basic knowledge that the error
was something to do with the connection, not the query. Tom, when I said
way back when that I wanted to do recovery and retry after a connection
was lost, weren't you the one who said "this scares the hell out of me"
because you couldn't be sure whether the last command committed?

I thought the mantra when it came to networking went "don't second-guess
the OS." If a socket call returns a negative value, there are a few known
errno values that mean it's a transient thing. Fine, if you identify more
of those then chuck them in there. There are several cases of that in
there already. But otherwise, why not assume that the system gave you an
error code because it decided it saw a failure?
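
To illustrate the policy argued for above, here is a minimal sketch in C. It
is not code from libpq or from this thread: the names errno_is_transient,
classify_io_result and sock_verdict are hypothetical, and the list of
transient errno values (EINTR, EAGAIN, EWOULDBLOCK) is an assumption about
which ones would qualify. Anything not on the list is treated as a real
failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not part of libpq): true if an errno value from a
 * failed socket call is one of the few conditions assumed to be transient. */
bool errno_is_transient(int err)
{
    switch (err) {
    case EINTR:   /* call was interrupted by a signal; just retry */
    case EAGAIN:  /* non-blocking socket is not ready yet */
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
        return true;
    default:
        return false;
    }
}

/* Hypothetical verdict type for the result of one socket read or write. */
typedef enum { SOCK_OK, SOCK_RETRY, SOCK_FAILED } sock_verdict;

/* Classify the return value of a socket call together with the errno it
 * left behind. A non-negative count means the OS reported no error (a
 * zero-byte read, i.e. EOF, is left for the caller to handle). A negative
 * count is retried only for the known transient codes; everything else is
 * taken at face value as a failure. */
sock_verdict classify_io_result(long nbytes, int err)
{
    if (nbytes >= 0)
        return SOCK_OK;
    return errno_is_transient(err) ? SOCK_RETRY : SOCK_FAILED;
}
```

A caller would pass the return value of recv() or send() plus the errno saved
right after the call, retry on SOCK_RETRY, and mark the connection as bad on
SOCK_FAILED. Newly identified transient codes would only need another case
label.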

Jeroen
