Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket

From: Nico Williams <nico(at)cryptonector(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Asim R P <apraveen(at)pivotal(dot)io>, Jimmy Yih <jyih(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Date: 2018-07-19 20:01:44
Message-ID: 20180719200143.GJ9712@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 19, 2018 at 03:49:35PM -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2018-07-19 11:57:25 +0300, Heikki Linnakangas wrote:
> >> The regular backend's quickdie() function is more tricky. It should also
> >> call _exit(2) rather than exit(2). But it also tries to ereport a WARNING,
> >> and that is quite useful.
>
> There's already an on_exit_reset in there; why do we need more than that?

You're not allowed to call exit(3) in signal handlers. exit(3) is not
async-signal-safe (for example, it calls atexit handlers, flushes stdio,
and who knows what else).

> > Is that actually true? Clients like libpq create the same error message
> > (which has its own issues, because it'll sometimes mis-interpret
> > things). The message doesn't actually have useful content, no?
>
> Yes, it does: it lets users tell the difference between exit due to a
> SIGQUIT and a crash of their own backend.

ereport() calls (via errstart() and write_stderr()) vfprintf() and
fflush(stderr). That's absolutely not async-signal-safe. The WIN32
code in write_stderr() does use vsnprintf() instead, and that could get
written to stderr with write(2), which is async-signal-safe.

> Admittedly, if we crash trying to send the message, then we're not
> better off. But since that happens only very rarely, I do not think
> it's a reasonable tradeoff to never send it at all.

If sending it means using OpenSSL or what have you, that's probably even
more async-signal-safe code.

Now, ereport() could all be made safe enough for use in signal handlers,
but it isn't at this time.

What I'd do is have a volatile sig_atomic_t in_signal_handler_context
variable to indicate that we're dying, and then when that is non-zero,
ereport() and friends could use all-async-signal-safe codepaths.

Nico
--

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-07-19 20:04:01 Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Previous Message Tom Lane 2018-07-19 19:56:41 Re: Possible performance regression in version 10.1 with pgbench read-write tests.