Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: MauMau <maumau307(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Date: 2013-02-01 14:04:30
Message-ID: 20130201140430.GC6915@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote:
> On 1/31/13 5:42 PM, MauMau wrote:
> > Thank you for sharing your experience. So you also considered making
> > postmaster SIGKILL children like me, didn't you? I bet most people
> > who encounter this problem would feel like that.
> >
> > It is definitely pg_ctl who needs to be prepared, not the users. It may
> > not be easy to find out postgres processes to SIGKILL if multiple
> > instances are running on the same host. Just doing "pkill postgres"
> > will unexpectedly terminate postgres of other instances.
>
> In my case, it was one backend process segfaulting, and then some other
> backend processes didn't respond to the subsequent SIGQUIT sent out by
> the postmaster. So pg_ctl didn't have any part in it.
>
> We ended up addressing that by installing a nagios event handler that
> checked for this situation and cleaned it up.
>
> > I would like to make a patch that changes SIGQUIT to SIGKILL when
> > postmaster terminates children. Any other better ideas?
>
> That was my idea back then, but there were some concerns about it.
>
> I found an old patch that I had prepared for this, which I have
> attached. YMMV.

> +static void
> +quickdie_alarm_handler(SIGNAL_ARGS)
> +{
> + /*
> + * We got here if ereport() was blocking, so don't go there again
> + * except when really asked for.
> + */
> + elog(DEBUG5, "quickdie aborted by alarm");
> +

It's probably not wise to enter elog.c again; that path might allocate
memory, and we wouldn't be any wiser. Unfortunately there's not much
besides a write(2) to stderr that can safely be done...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
