Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Andres Freund <andres(at)anarazel(dot)de>, nico(at)cryptonector(dot)com, Asim Praveen <apraveen(at)pivotal(dot)io>, Jimmy Yih <jyih(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Date: 2018-07-20 21:41:28
Message-ID: CALfoeit8G_CbyUxQu2Tx2m_2vjN1cX6Zxi+8rx6VieGbcoBzeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 20, 2018 at 1:54 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> On 19/07/18 23:13, Andres Freund wrote:
> > On 2018-07-19 14:54:52 -0500, Nico Williams wrote:
> >> On Thu, Jul 19, 2018 at 12:20:53PM -0700, Andres Freund wrote:
> >>> On 2018-07-19 11:57:25 +0300, Heikki Linnakangas wrote:
> >>>> Ugh. Yeah, in wal_quickdie, and other aux process *_quickdie()
> handlers, I
> >>>> agree we should just _exit(2). All we want to do is to exit the
> process
> >>>> immediately.
> >>>
> >>> Indeed.
> >>>
> >>>> bgworker_quickdie() makes some effort to block SIGQUIT during the
> exit()
> >>>> processing, but that doesn't solve the whole problem. The process
> could've
> >>>> been in the middle of a malloc/free when the signal arrived, for
> example.
> >>>> exit() is simply not safe to call from a signal handler.
> >>>
> >>> Yea. I really don't understand why we have't been able to agree on this
> >>> for *years* now.
> >>
> >> I mean, only calling async-signal-safe functions from signal handlers is
> >> a critical POSIX requirement, and exit(3) is NOT async-signal-safe.
> >
> > There's a few standard requirements that aren't particularly relevant in
> > practice (including some functions not being listed as signal
> > safe). Just arguing from that isn't really helpful. But there's plenty
> > hard evidence, including a few messages upthread!, of it being
> > practically problematic. Just looking at the reasoning at why exit (and
> > malloc) aren't signal safe however, makes it clear that this particular
> > restriction should be adhered to, however.
>
> ISTM that no-one has any great ideas on what to do about the ereport()
> in quickdie(). But I think we have consensus on replacing the exit(2)
> calls with _exit(2). If we do just that, it would be better than the
> status quo, even if it doesn't completely fix the problem. This would
> prevent the case that Asim reported, for starters.
>
> Any objections to the attached?
>
> With _exit(), I think we wouldn't really need the
> "PG_SETMASK(&BlockSig);" calls in the aux process *_quickdie handlers
> that don't do anything else than call _exit(). But I didn't dare to
> remove them yet.
>

Seems there is opportunity to actuall consolidate all those quickdies as
well. As I see no difference between the auxiliary process quickdie
implementations. Only backend `quickdie()` has special code for ereport and
all.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jerry Jelinek 2018-07-20 21:50:37 Re: patch to allow disable of WAL recycling
Previous Message Dmitry Dolgov 2018-07-20 21:32:34 Re: [HACKERS] [PATCH] Generic type subscripting