
Re: backend hangs at immediate shutdown

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>,"Andres Freund" <andres(at)2ndquadrant(dot)com>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: backend hangs at immediate shutdown
Date: 2013-01-31 12:40:39
Message-ID: 611D451C5FB14A86889EA631B1D5B885@maumau
Lists: pgsql-hackers
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "MauMau" <maumau307(at)gmail(dot)com> writes:
>> How about the case where some backend crashes due to a bug of PostgreSQL?
>> In this case, postmaster sends SIGQUIT to all backends, too.  The instance
>> is expected to disappear cleanly and quickly.  Doesn't the hanging backend
>> harm the restart of the instance?
>
> [ shrug... ]  That isn't guaranteed, and never has been --- for
> instance, the process might have SIGQUIT blocked, perhaps as a result
> of third-party code we have no control over.

Are you concerned about user-defined C functions?  I don't think they need 
to block signals.  So I don't find it too restrictive to say "do not block 
or send signals in user-defined functions."  If it's a real concern, it
should be noted in the manual, rather than writing "do not use pg_ctl
stop -mi as much as you can, because it can leave hanging backends."

>> How about using SIGKILL instead of SIGQUIT?
>
> Because then we couldn't notify clients at all.  One practical
> disadvantage of that is that it would become quite hard to tell from
> the outside which client session actually crashed, which is frequently
> useful to know.

How is the message below useful to determine which client session actually 
crashed?  The message doesn't contain information about the crashed session. 
Are you talking about log_line_prefix?

ERROR:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.

However, it is not quickdie() but LogChildExit() that emits useful 
information to tell which session crashed.  So I don't think quickdie()'s 
message is very helpful.


> I think if we want to make it bulletproof we'd have to do what the
> OP suggested and switch to SIGKILL.  I'm not enamored of that for the
> reasons I mentioned --- but one idea that might dodge the disadvantages
> is to have the postmaster wait a few seconds and then SIGKILL any
> backends that hadn't exited.

I believe that SIGKILL is the only simple and reliable choice.  Consider
again: the purpose of "pg_ctl stop -mi" is to immediately and reliably shut
down the instance.  If it is not reliable, what can we do instead?
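
To make the escalation idea quoted above concrete (send SIGQUIT, wait a few
seconds, then SIGKILL whatever has not exited), here is a minimal sketch.  It
is Python rather than postmaster C code, and stop_children(), spawn_child(),
and the grace period are hypothetical names and values chosen for
illustration; nothing here is taken from the PostgreSQL source.  One child
keeps SIGQUIT blocked to stand in for a backend stuck in third-party code.

```python
import os
import resource
import signal
import time


def spawn_child(block_sigquit):
    """Fork a child that sleeps forever; optionally it keeps SIGQUIT blocked."""
    # Block SIGQUIT before forking so the child inherits the mask and no
    # signal can arrive before the child has finished setting itself up.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGQUIT})
    pid = os.fork()
    if pid == 0:
        try:
            signal.signal(signal.SIGQUIT, signal.SIG_DFL)
            # Avoid writing core files when SIGQUIT terminates the child.
            resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
            if not block_sigquit:
                signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGQUIT})
            while True:
                time.sleep(60)
        finally:
            os._exit(1)
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return pid


def _terminating_signal(status):
    """Return the signal number that ended the process, or None."""
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


def stop_children(pids, grace_seconds=5.0, poll_interval=0.05):
    """SIGQUIT every pid, then SIGKILL survivors after grace_seconds.

    Returns {pid: number of the signal that terminated it, or None}.
    """
    remaining = set(pids)
    ended_by = {}

    for pid in remaining:
        os.kill(pid, signal.SIGQUIT)

    # Grace period: reap children that honoured SIGQUIT.
    deadline = time.monotonic() + grace_seconds
    while remaining and time.monotonic() < deadline:
        for pid in list(remaining):
            done, status = os.waitpid(pid, os.WNOHANG)
            if done == pid:
                remaining.discard(pid)
                ended_by[pid] = _terminating_signal(status)
        if remaining:
            time.sleep(poll_interval)

    # Escalation: SIGKILL cannot be blocked, caught, or ignored.
    for pid in remaining:
        os.kill(pid, signal.SIGKILL)
        _, status = os.waitpid(pid, 0)
        ended_by[pid] = _terminating_signal(status)
    return ended_by


if __name__ == "__main__":
    normal = spawn_child(block_sigquit=False)
    stuck = spawn_child(block_sigquit=True)
    for pid, sig in stop_children([normal, stuck], grace_seconds=1.0).items():
        print(pid, signal.Signals(sig).name if sig else "exited")
```

The trade-off is the one discussed in this thread: a process that receives
SIGKILL gets no chance to send anything to its client, so only the backends
that outlive the grace period lose the notification.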


Regards
MauMau


