Coping with backend crash in libpq

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org, pgsql-interfaces(at)postgreSQL(dot)org
Subject: Coping with backend crash in libpq
Date: 1998-07-28 17:23:35
Message-ID: 4870.901646615@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-interfaces

I've just noticed that libpq doesn't cope very gracefully if the backend
exits when not in the middle of a query (i.e., because the postmaster told
it to quit after some other backend crashed). The behavior in psql, for
example, is that the next time you issue a query, psql just exits
without printing anything at all. This is Not Friendly, especially
considering that the backend sent a nice little notice message before it quit.

The main problem is that if the next thing you do is to send a new query,
send() sees that the connection has been closed and generates a SIGPIPE
signal. By default that terminates the frontend process.

We could cure this by having libpq disable SIGPIPE, but we would have
to disable it before each send() and re-enable afterwards to avoid
affecting the behavior of the rest of the frontend application.
Two additional kernel calls per query sounds like a lot of overhead.
(We do actually do this when trying to close the connection, but not
during normal queries.)

Perhaps a better answer is to have PQsendQuery check for fresh input
from the backend before trying to send the query. This would have two
side effects:
1. If a NOTICE message has arrived, we could print it.
2. If EOF is detected, we will reset the connection state to
CONNECTION_BAD, which PQsendQuery can use to avoid trying to send.

The minimum cost to do this is one kernel call (a select(), which
unfortunately is probably a fairly expensive call) in the normal
case where no new input has arrived. Another objection is that it's
not 100% bulletproof --- if the backend closes the connection in the
window between select() and send() then you can still get SIGPIPE'd.
The odds of this seem pretty small however.
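
As an illustration of the check described above, here is a minimal sketch of
a zero-timeout select() followed by a look at the socket. The names
poll_peer and enum peer_state are made up for this example and do not exist
in libpq. Using recv() with MSG_PEEK is also an assumption made to keep the
example short; real code would presumably read the data into libpq's input
buffer so that a pending NOTICE could be printed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch. */
enum peer_state {
    PEER_ERROR = -1,    /* select() or recv() failed */
    PEER_IDLE = 0,      /* nothing has arrived: the normal case */
    PEER_HAS_DATA = 1,  /* input is waiting, e.g. a NOTICE message */
    PEER_CLOSED = 2     /* EOF: the other end closed the connection */
};

/* Check, without blocking, whether the peer sent data or closed. */
enum peer_state poll_peer(int sock)
{
    fd_set readable;
    struct timeval no_wait = {0, 0};
    char probe;
    ssize_t n;
    int ready;

    assert(sock >= 0 && sock < FD_SETSIZE);

    FD_ZERO(&readable);
    FD_SET(sock, &readable);

    /* The one kernel call paid in the normal case. */
    ready = select(sock + 1, &readable, NULL, NULL, &no_wait);
    if (ready < 0)
        return PEER_ERROR;
    if (ready == 0)
        return PEER_IDLE;

    /* Readable means either data or EOF; peek one byte to tell which. */
    n = recv(sock, &probe, 1, MSG_PEEK);
    if (n > 0)
        return PEER_HAS_DATA;
    if (n == 0)
        return PEER_CLOSED;
    return PEER_ERROR;
}
```

A sender would call poll_peer() first and skip the send() when it reports
PEER_CLOSED. The race noted above remains: the peer can still close the
connection after poll_peer() returns and before send() runs.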

I'm inclined to go with the second approach (checking for input in
PQsendQuery), because it seems to have less of a performance impact,
and it will ensure that the backend's polite
"The Postmaster has informed me that some other backend died abnormally
and possibly corrupted shared memory." message gets displayed. With
the first approach (disabling SIGPIPE) we'd still have to go through
some pushups to get the notice to come out.

Does anyone have an objection, or a better idea?

regards, tom lane
