Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andres Freund" <andres(at)2ndquadrant(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Date: 2013-01-30 22:41:23
Message-ID: 30376B82976C4AC1A306B782CA378FEA@maumau
Lists: pgsql-hackers

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "MauMau" <maumau307(at)gmail(dot)com> writes:
>> I think the solution is the typical one. That is, to just remember the
>> receipt of SIGQUIT by setting a global variable and call siglongjmp() in
>> quickdie(), and perform tasks currently done in quickdie() when
>> sigsetjmp() returns in PostgresMain().
>
> I think this cure is considerably worse than the disease. As stated,
> it's not a fix at all: longjmp'ing out of a signal handler is no better
> defined than what happens now, in fact it's probably even less safe.
> We could just set a flag and wait for the mainline code to notice,
> but that would make SIGQUIT hardly any stronger than SIGTERM --- in
> particular it couldn't get you out of any loop that wasn't checking for
> interrupts.

Oh, I was careless. You are right: my suggestion is not a fix at all,
because if the signal interrupts free(), free() would still hold its lock
after siglongjmp(), and a later malloc() would block trying to acquire it.
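
For reference, the flag-only alternative mentioned in the quoted text could
look roughly like the sketch below. It is not PostgreSQL code: the names
quit_requested, remember_sigquit, install_quit_flag_handler and
run_until_quit are hypothetical, and the loop stands in for mainline code
that polls for interrupts. The handler only sets a flag, so it is safe, but
nothing happens until the loop looks at the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag, set by the handler and polled by mainline code. */
static volatile sig_atomic_t quit_requested = 0;

/* The handler does nothing except record that SIGQUIT arrived. */
static void remember_sigquit(int signo)
{
    (void) signo;
    quit_requested = 1;
}

/* Install the handler for SIGQUIT; returns 0 on success, -1 on failure. */
static int install_quit_flag_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = remember_sigquit;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGQUIT, &sa, NULL);
}

/*
 * Stand-in for a mainline loop that checks for interrupts.
 * Returns the number of work steps completed before the flag was seen.
 * A loop without the check would never notice the signal.
 */
static int run_until_quit(int max_steps)
{
    int done = 0;

    while (done < max_steps)
    {
        if (quit_requested)
            break;              /* the only place the signal takes effect */
        done++;
    }
    return done;
}
```

With this shape, SIGQUIT is only as strong as the polling in the loop, which
is the weakness pointed out above.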

> The long and the short of it is that SIGQUIT is the emergency-stop panic
> button. You don't use it for routine shutdowns --- you use it when
> there is a damn good reason to and you're prepared to do some manual
> cleanup if necessary.
>
> http://en.wikipedia.org/wiki/Big_Red_Switch

How about the case where some backend crashes due to a bug in PostgreSQL?
In this case, the postmaster sends SIGQUIT to all backends, too. The
instance is expected to disappear cleanly and quickly. Doesn't a hanging
backend hinder the restart of the instance?

How about using SIGKILL instead of SIGQUIT? The purpose of SIGQUIT is to
shut down the processes quickly, and SIGKILL is the best signal for that
purpose. The WARNING message would not be sent to clients, but that does
not justify being unable to shut down immediately.
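
To illustrate the difference, here is a small standalone sketch, not taken
from the PostgreSQL sources. The helper kill_stuck_child is hypothetical: it
forks a child that ignores SIGQUIT, as a stand-in for a backend that does
not react to it, and then stops the child with SIGKILL, which cannot be
caught, blocked, or ignored.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that ignores SIGQUIT, send it SIGQUIT
 * and then SIGKILL, and return the signal that terminated it (or -1 on
 * error or if the child was not terminated by a signal).
 */
static int kill_stuck_child(void)
{
    int     ready[2];
    int     status;
    char    byte;
    pid_t   pid;

    if (pipe(ready) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: stop reacting to SIGQUIT, then tell the parent. */
        signal(SIGQUIT, SIG_IGN);
        close(ready[0]);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();            /* sleep until a signal ends the process */
    }

    /* Parent: wait until the child is ignoring SIGQUIT. */
    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1)
        byte = 0;               /* child failed early; still reap it below */
    close(ready[0]);

    kill(pid, SIGQUIT);         /* ignored by the child, so it has no effect */
    kill(pid, SIGKILL);         /* cannot be ignored; the kernel ends the child */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The child never gets a chance to run any code in response to SIGKILL, which
is why no message could be sent to clients in that case.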

Regards
MauMau
