Re: Escaping from blocked send() reprised.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Escaping from blocked send() reprised.
Date: 2014-09-26 18:02:16
Message-ID: 5425AA28.60308@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/09/2014 01:31 PM, Kyotaro HORIGUCHI wrote:
> Hi, I added and edited some comments.
>
>> Sorry, It tha patch contains a silly bug. Please find the
>> attatched one.

I must say this scares the heck out of me. The current code goes through
some trouble to not throw an error while in a recv() send(). For
example, you removed the DoingCommandRead check from
prepare_for_client_read(). There's an comment in postgres.c that says this:

> /*
> * (2) Allow asynchronous signals to be executed immediately if they
> * come in while we are waiting for client input. (This must be
> * conditional since we don't want, say, reads on behalf of COPY FROM
> * STDIN doing the same thing.)
> */
> DoingCommandRead = true;

With the patch, we do allow asynchronous signals to be processed while
blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel
comfortable just changing it. (the comment is now wrong, of course)

This patch also enables processing query cancel signals while blocked,
not just SIGTERM. That's not good; we might be in the middle of sending
a message, and we cannot just error out of that or we might violate the
fe/be protocol. That's OK with a SIGTERM as you're terminating the
connection anyway, and we have the PqCommBusy safeguard in place that
prevents us from sending broken messages to the client, but that's not
good enough if we wanted to keep the backend alive, as we won't be able
to send anything to the client anymore.

BTW, we've been talking about blocking in send(), but this patch also
let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's
probably a good thing; surely you have exactly the same issues with that
as with send(). But I didn't realize we had a problem with that too.

In summary, this patch is not ready as it is, but I think we can fix it.
The key question is: is it safe to handle SIGTERM in the signal handler,
calling the exit-handlers and exiting the backend, when blocked in a
recv() or send()? It's a change in the pqcomm.c API; most pqcomm.c
functions have not thrown errors or processed interrupts before. But
looking at the callers, I think it's safe, and there isn't actually any
comments explicitly saying that pqcomm.c will never throw errors.

I propose the attached patch. It adds a new flag ImmediateDieOK, which
is a weaker form of ImmediateInterruptOK that only allows handling a
pending die-signal in the signal handler.

Robert, others, do you see a problem with this?

Over IM, Robert pointed out that it's not safe to jump out of a signal
handler with siglongjmp, when we're inside library calls, like in a
callback called by OpenSSL. But even with current master branch, that's
exactly what we do. In secure_raw_read(), we set ImmediateInterruptOK =
true, which means that any incoming signal will be handled directly in
the signal handler, which can mean elog(ERROR). Should we be worried?
OpenSSL might get confused if control never returns to the SSL_read() or
SSL_write() function that called secure_raw_read().

- Heikki

Attachment Content-Type Size
ImmediateDieOk-1.patch text/x-diff 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-09-26 18:02:23 Re: proposal: rounding up time value less than its unit.
Previous Message David Johnston 2014-09-26 17:43:45 Re: proposal: rounding up time value less than its unit.