Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Asim R P <apraveen(at)pivotal(dot)io>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Jimmy Yih <jyih(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Date: 2018-07-19 08:57:25
Message-ID: b3c6f95f-b130-c814-92b7-878f2ab0b65d@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19/07/18 03:26, Asim R P wrote:
> On Thu, Jun 22, 2017 at 10:50 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Or, probably more robust: Simply _exit(2) without further ado, and rely
>> on postmaster to output an appropriate error message. Arguably it's not
>> actually useful to see hundreds of "WARNING: terminating connection because of
>> crash of another server process" messages in the log anyway.
>>
>
> To support using _exit(2) in *quickdie() handlers, I would like to
> share another stack trace indicating self-deadlock. In this case, WAL
> writer process got SIGQUIT while it was already handling a SIGQUIT,
> leading to self-deadlock.

Ugh. Yeah, in wal_quickdie, and other aux process *_quickdie() handlers,
I agree we should just _exit(2). All we want to do is to exit the
process immediately.

bgworker_quickdie() makes some effort to block SIGQUIT during the exit()
processing, but that doesn't solve the whole problem. The process
could've been in the middle of a malloc/free when the signal arrived,
for example. exit() is simply not safe to call from a signal handler.

The regular backend's quickdie() function is more tricky. It should also
call _exit(2) rather than exit(2). But it also tries to ereport a
WARNING, and that is quite useful. ereport() is generally not safe in a
signal handler. There is some protection for specific failures: see
socket_putmessage(), for example, which has a mechanism to prevent a
message from being sent to the client in the middle of another message.
But at the end of the day, ereport() is complicated enough that it will
surely do a lot of unsafe things.

I don't have any great ideas on how to make the ereport(WARNING) safer,
but I agree we should at least change all the exit(2) calls in quickdie
handlers to _exit(2).

BTW, if postmaster is still alive, it will send a SIGKILL to any child
process that doesn't terminate in 5 seconds. So a deadlock in SIGQUIT
handler isn't that bad. Some other nasty things could happen, however;
all bets are off if you call unsafe functions from a signal handler.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2018-07-19 09:14:17 Re: MAP syntax for arrays
Previous Message Nikhil Sontakke 2018-07-19 08:55:19 Re: [HACKERS] logical decoding of two-phase transactions