libpq drops error messages received just before backend crash

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: libpq drops error messages received just before backend crash
Date: 1999-08-16 02:56:30
Message-ID: 23527.934772190@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While poking at the vacuum-induced coredump we were discussing on
Friday, I noticed that psql did not report
ERROR: vacuum: can't destroy lock file!
even though this message was showing up in the postmaster log.
Even more interesting, psql *did* report the
NOTICE: AbortTransaction and not in in-progress state
that the backend emitted *after* the elog(ERROR) and just before
coredumping.

It turns out that this is a libpq deficiency: it's got the error
message, but because PQexec() was used, it's waiting around for
a 'Z' ReadyForQuery message before it hands the error message
back to the application. Since the backend crashes, of course
the 'Z' never comes ... and when libpq detects closure of the
connection, it wipes out the stored error message in its haste
to report
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
which is all that the user gets to see, unless he thinks to
look in the postmaster log. Boo hiss.

(The reason the NOTICE shows up is that it's just dumped to stderr
immediately upon receipt, rather than being queued to hand back
to the application.)

I have a fix in mind for this: concatenate "backend closed the channel"
to the waiting error message, instead of wiping it out. But I think
I will wait till after Michael Ansley's long-query changes have been
committed before I start hacking on libpq again.

Anyway, if you want to know what really happened right before a
backend crash, you should look in the postmaster log until this
is fixed...

regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message The Hermit Hacker 1999-08-16 02:57:11 Re: [HACKERS] Re: autoconf does not recognize HP-UX 11.00.
Previous Message The Hermit Hacker 1999-08-16 02:55:27 Re: [HACKERS] New man pages