Re: libpq and psql not on same page about SIGPIPE

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, manfred(at)colorfullife(dot)com
Subject: Re: libpq and psql not on same page about SIGPIPE
Date: 2004-12-01 04:29:56
Message-ID: 200412010429.iB14Tul07030@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> libpq compiled with --enable-thread-safety thinks it can set the SIGPIPE
> signal handler. It thinks once is enough.
>
> psql thinks it can arbitrarily flip the signal handler between SIG_IGN
> and SIG_DFL. Ergo, after the first use of the pager for output, libpq's
> SIGPIPE handling will be broken.
>
> I submit that psql is unlikely to be the only program that does this,
> and therefore that libpq must be considered broken, not psql.

I have researched possible fixes for our threaded SIGPIPE handling in
libpq. Basically, we need to ignore SIGPIPE during socket send() (and
SSL_write), because otherwise the process will die if the backend exits
unexpectedly. libpq would rather trap the failure itself.

In 7.4.X we set SIGPIPE to ignore before the write and reset it after
the write, but that doesn't work for threading because it affects all
threads, not just the thread using libpq.
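
A rough sketch of that 7.4.X-style pattern may help show the problem. This
is not the actual libpq code; the function name send_ignore_sigpipe_74 is
made up for illustration, and plain signal() stands in for whatever wrapper
libpq really uses.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>

typedef void (*sig_handler_t) (int);

/*
 * Hypothetical illustration of the 7.4.X approach: ignore SIGPIPE
 * around the write, then put the old disposition back.
 */
static ssize_t
send_ignore_sigpipe_74(int fd, const void *buf, size_t len)
{
    /* The disposition is process-wide, so every thread is affected here. */
    sig_handler_t old = signal(SIGPIPE, SIG_IGN);
    ssize_t     n = send(fd, buf, len, 0);
    int         save_errno = errno;

    /* Restore whatever the application had installed before the call. */
    if (old != SIG_ERR)
        signal(SIGPIPE, old);

    errno = save_errno;
    return n;
}
```

Between the two signal() calls, any other thread that writes to a broken
pipe also has its SIGPIPE ignored, and a thread that changes the handler
in that window will have its change overwritten by the restore.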

Our current setup is wrong because an application could change SIGPIPE
for its own purposes (like psql does) and remove our custom thread
handler for sigpipe.

The best solution seems to be one suggested by Manfred in November of
2003:

> signal handlers are a process property, not a thread property - that
> code is broken for multi-threaded apps.
> At least that's how I understand the opengroup man page, and a quick
> google confirmed that:
> http://groups.google.de/groups?selm=353662BF.9D70F63A%40brighttiger.com
>
> I haven't found a reliable thread-safe approach yet:
> My first idea was block with pthread_sigmask, after send check if
> pending with sigpending, and then delete with sigwait, and restore
> blocked state. But that breaks if SIGPIPE is blocked and a signal is
> already pending: there is no way to remove our additional SIGPIPE. I
> don't see how we can avoid destroying the realtime signal info.

His idea is the sequence pthread_sigmask/send/sigpending/sigwait/restore-mask.
It seems we could also check whether send() failed with errno set to EPIPE
rather than calling sigpending.

He has a concern about an application that has already blocked SIGPIPE
and already has a SIGPIPE pending. One idea would be to call sigpending()
before the send() and clear the signal only if SIGPIPE wasn't pending
before the call. I realize that if our send() also generates a SIGPIPE,
it would remove the previous realtime signal info, but that seems a
minor problem.

Comments? This seems like our only solution.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
