Re: backend hangs at immediate shutdown

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andres Freund" <andres(at)2ndquadrant(dot)com>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: backend hangs at immediate shutdown
Date: 2013-01-31 12:40:39
Message-ID: 611D451C5FB14A86889EA631B1D5B885@maumau
Lists: pgsql-hackers

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "MauMau" <maumau307(at)gmail(dot)com> writes:
>> How about the case where some backend crashes due to a bug of PostgreSQL?
>> In this case, postmaster sends SIGQUIT to all backends, too. The instance
>> is expected to disappear cleanly and quickly. Doesn't the hanging backend
>> harm the restart of the instance?
>
> [ shrug... ] That isn't guaranteed, and never has been --- for
> instance, the process might have SIGQUIT blocked, perhaps as a result
> of third-party code we have no control over.

Are you concerned about user-defined C functions? I don't think they need
to block signals, so I don't find it too restrictive to say "do not block
or send signals in user-defined functions." If it's a real concern, it
should be noted in the manual, rather than telling users "avoid pg_ctl
stop -mi as much as you can, because it can leave hanging backends."

>> How about using SIGKILL instead of SIGQUIT?
>
> Because then we couldn't notify clients at all. One practical
> disadvantage of that is that it would become quite hard to tell from
> the outside which client session actually crashed, which is frequently
> useful to know.

How is the message below useful to determine which client session actually
crashed? The message doesn't contain information about the crashed session.
Are you talking about log_line_prefix?

ERROR: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

However, it is not quickdie() but LogChildExit() that emits useful
information to tell which session crashed. So I don't think quickdie()'s
message is very helpful.

> I think if we want to make it bulletproof we'd have to do what the
> OP suggested and switch to SIGKILL. I'm not enamored of that for the
> reasons I mentioned --- but one idea that might dodge the disadvantages
> is to have the postmaster wait a few seconds and then SIGKILL any
> backends that hadn't exited.

I believe that SIGKILL is the simplest choice, and the only reliable one.
Consider again: the purpose of "pg_ctl stop -mi" is to shut the instance down
immediately and reliably. If it is not reliable, what can we do instead?
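
For reference, the compromise Tom describes (send SIGQUIT, wait a little, then
SIGKILL whatever is still alive) could look roughly like the sketch below. This
is not postmaster code: spawn_child() and stop_with_escalation() are
hypothetical names invented for illustration, the 10 ms polling interval is
arbitrary, and it handles a single child rather than a set of backends. The
child can be told to block SIGQUIT, which stands in for the "third-party code"
case.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that idles forever.  If block_sigquit
 * is nonzero the child blocks SIGQUIT, imitating a hung backend.  A pipe
 * is used so the parent returns only after the child's signal setup is done.
 */
static pid_t spawn_child(int block_sigquit)
{
    int   ready[2];
    char  c;
    pid_t pid;

    if (pipe(ready) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
    {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0)
    {
        sigset_t set;

        close(ready[0]);
        signal(SIGQUIT, SIG_DFL);   /* do not inherit an "ignored" disposition */
        sigemptyset(&set);
        sigaddset(&set, SIGQUIT);
        sigprocmask(block_sigquit ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();
    }
    close(ready[1]);
    if (read(ready[0], &c, 1) != 1)
    {
        /* Child never reported ready: clean it up and report failure. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        pid = -1;
    }
    close(ready[0]);
    return pid;
}

/*
 * Send SIGQUIT, poll for up to grace_ms milliseconds, then fall back to
 * SIGKILL.  Returns the signal that terminated the child, 0 if it exited
 * normally, or -1 on error.
 */
static int stop_with_escalation(pid_t pid, int grace_ms)
{
    int             status = 0;
    int             waited_ms = 0;
    struct timespec tick = {0, 10 * 1000 * 1000};   /* 10 ms */

    if (kill(pid, SIGQUIT) != 0)
        return -1;
    for (;;)
    {
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            break;                  /* child is gone */
        if (r < 0)
            return -1;
        if (waited_ms >= grace_ms)
        {
            /* Grace period over: SIGKILL cannot be blocked or caught. */
            kill(pid, SIGKILL);
            if (waitpid(pid, &status, 0) != pid)
                return -1;
            break;
        }
        nanosleep(&tick, NULL);
        waited_ms += 10;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With a cooperative child, stop_with_escalation() should report SIGQUIT; with a
child that blocks SIGQUIT it should report SIGKILL once the grace period runs
out. A client attached to the second kind of process would get no message at
all, which is the trade-off discussed above.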

Regards
MauMau
