Re: quickdie doing memory allocations (was atomic pin/unpin causing errors)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quickdie doing memory allocations (was atomic pin/unpin causing errors)
Date: 2016-05-05 20:23:02
Message-ID: 20160505202302.khv7zpe5p234zf4b@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2016-05-05 15:56:45 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> >> #0 0x00000008014321d7 in sbrk () from /lib/libc.so.7
> >> #1 0x0000000801431ddd in sbrk () from /lib/libc.so.7
> >> #2 0x000000080142e5bb in sbrk () from /lib/libc.so.7
> >> #3 0x000000080142e085 in sbrk () from /lib/libc.so.7
> >> #4 0x000000080142de28 in sbrk () from /lib/libc.so.7
> >> #5 0x000000080142e1cf in sbrk () from /lib/libc.so.7
> >> #6 0x0000000801439815 in free () from /lib/libc.so.7
> >> #7 0x000000080149e3d6 in nsdispatch () from /lib/libc.so.7
> >> #8 0x00000008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> >> #9 0x000000080144525c in exit () from /lib/libc.so.7
> >> #10 0x00000000008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
> >> #11 <signal handler called>
> >> #12 0x0000000801431847 in sbrk () from /lib/libc.so.7
>
> > That looks like an independent issue, namely that we're triggering memory
> > allocations from a signal handler (see frames 12, 11, 10, 9). Presumably
> > due to system-registered atexit handlers. I suspect we should be using
> > _exit() here? Tom?
>
> I don't think that would improve matters. In the first place, if we use
> _exit() here that might encourage third-party extension authors to believe
> they should use _exit(), which would be bad.

The source tree already has a number of _exit() calls, so I don't think
that'd make a meaningful difference.

> In the second place, we don't know what it is we're skipping by not
> running atexit handlers, and again that could be bad.

I've a hard time coming up with a scenario where that'd be a problem in
a PANIC case. Isn't it pretty common to use _exit after fatal errors
(and forks)?
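
To make the exit() versus _exit() distinction concrete, here is a minimal
sketch, not taken from the PostgreSQL source. It forks a child that registers
an atexit handler and then terminates one way or the other; the parent learns
through a pipe whether the handler ran. The names note_handler_ran and
atexit_handler_runs are hypothetical, and the handler merely stands in for
whatever libc or a loaded library might have registered.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the report pipe; only set in the child. */
static int report_fd = -1;

/* Hypothetical atexit handler: tells the parent that it was invoked. */
static void
note_handler_ran(void)
{
    const char mark = 'x';
    ssize_t rc = write(report_fd, &mark, 1);

    (void) rc;
}

/*
 * Fork a child that registers an atexit handler and then terminates with
 * exit() or _exit(). Returns 1 if the handler ran, 0 if it did not, and
 * -1 if pipe() or fork() failed.
 */
int
atexit_handler_runs(int use_underscore_exit)
{
    int     fds[2];
    pid_t   pid;
    char    buf;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: register the handler, then leave one way or the other. */
        close(fds[0]);
        report_fd = fds[1];
        atexit(note_handler_ran);
        if (use_underscore_exit)
            _exit(2);           /* skips atexit handlers */
        exit(2);                /* runs atexit handlers first */
    }

    /* Parent: a read of zero bytes (EOF) means the handler never wrote. */
    close(fds[1]);
    n = read(fds[0], &buf, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    return n == 1 ? 1 : 0;
}
```

Calling atexit_handler_runs(0) exercises the exit() path and
atexit_handler_runs(1) the _exit() path. Only the former gives registered
handlers a chance to run, and with them whatever allocation or locking
they do.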

> In the third place, by the time we
> get to the exit() call we've already exposed ourselves to a whole lot of
> such hazards by running ereport() (including sending a message to the
> client!).

True. And that's not good. But the magic of ErrorContext shields us from
a fair amount of issues.

> In the fourth place, if we've received a quickdie interrupt,
> it doesn't actually matter if the process crashes; we just want it to
> quit ASAP.

If it were always crashing, that'd be somewhat fine. But sbrk internally
uses mutexes, so this can result in processes getting stuck. And that is
a problem. There've actually been reports about that every now and then.

Greetings,

Andres Freund
