Re: possible self-deadlock window after bad ProcessStartupPacket

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jimmy Yih <jyih(at)pivotal(dot)io>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: possible self-deadlock window after bad ProcessStartupPacket
Date: 2017-06-22 17:50:31
Message-ID: 20170622175031.thq4rl2ygzvk522z@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-06-22 10:41:41 -0700, Andres Freund wrote:
> On 2017-02-02 12:18:22 -0800, Jimmy Yih wrote:
> > In the above pull request, Heikki also mentions that a similar scenario can
> > happen during palloc() as well... which is similar to what we saw in
> > Greenplum a couple years back for a deadlock in a malloc() call where we
> > responded by changing exit() to _exit() in quickdie as a fix. That could
> > possibly be applicable to latest Postgres as well.
>
> Isn't the quickdie() issue that we palloc/malloc in the first place,
> rather than the exit call? There's some risk for exit() too, but we
> reset our own atexit handlers before exiting, so the risk seems fairly
> small.
>
>
> In my opinion the fix here would be to do it right and never emit error
> messages from signal handlers via elog.c - we've progressed a lot
> towards the goal and do a lot less in signal handlers these days - but
> quickdie() is one glaring exception to that. I think a reasonable fix
> here would be to use write() of a statically allocated message, rather
> than elog proper, and not to send the message to the client. Libpq, and
> I presume other clients, synthesize a message upon unexpected socket
> closure anyway, and it's completely unclear whether we can send a
> message anyway.

Or, probably more robust: Simply _exit(2) without further ado, and rely
on postmaster to output an appropriate error message. Arguably it's not
actually useful to see hundreds of "WARNING: terminating connection because of
crash of another server process" messages in the log anyway.

- Andres
