Re: Shutdown fails with both 'fast' and 'immediate'

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Schnur <dnschnur(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Shutdown fails with both 'fast' and 'immediate'
Date: 2010-05-12 17:16:13
Message-ID: 22516.1273684573@sss.pgh.pa.us
Lists: pgsql-admin

David Schnur <dnschnur(at)gmail(dot)com> writes:
> I'm less concerned with the particular query than with the general question
> of when a shutdown could hang like this. I expected this to be possible
> when using -m fast, but my understanding was that -m immediate really forced
> termination.

Yeah, it's supposed to. The sequence is: pg_ctl -m immediate sends
SIGQUIT to the postmaster, which in turn sends SIGQUIT to all its child
processes, and their SIGQUIT interrupt handlers just immediately exit().
I was thinking earlier that there might be a bug in the postmaster state
machine that prevented it from sending SIGQUIT if it had already
received SIGTERM (-m fast), but a look at the sources doesn't support
that theory. The only obvious theory at this point is that the backend
is stuck in some uninterruptible kernel call, but it's hard to imagine
what.
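
The signal cascade described above can be imitated with a toy parent and a few children. This is only a minimal sketch for Unix-like systems, not PostgreSQL's actual code: the names CHILD_SOURCE, start_child and immediate_shutdown are hypothetical, and the exit status 2 is simply a value chosen for the illustration. Each child installs a SIGQUIT handler that exits at once, and the parent forwards SIGQUIT to every child and collects the exit codes.

```python
import signal
import subprocess
import sys

# Source for a stand-in "backend": on SIGQUIT it exits immediately with
# status 2 (an arbitrary value picked for this sketch), skipping cleanup.
CHILD_SOURCE = """
import os, signal, sys, time
signal.signal(signal.SIGQUIT, lambda signum, frame: os._exit(2))
sys.stdout.write("ready\\n")
sys.stdout.flush()
while True:
    time.sleep(60)
"""


def start_child():
    """Start one child and wait until its SIGQUIT handler is installed."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # blocks until the child prints "ready"
    return proc


def immediate_shutdown(children, timeout=5.0):
    """Send SIGQUIT to every child, then return their exit codes in order."""
    for proc in children:
        proc.send_signal(signal.SIGQUIT)
    # wait() raises subprocess.TimeoutExpired if a child fails to exit,
    # which is the toy equivalent of the hang discussed in this thread.
    return [proc.wait(timeout=timeout) for proc in children]
```

A child that is blocked somewhere it cannot be interrupted would never run its handler, so the wait would time out instead of returning a code.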

Is the postmaster still there after -m immediate, or does it quit?
If it's still there, maybe there's some problem in the earlier part
of the sequence.
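
One way to answer that question from a script is to read the PID on the first line of postmaster.pid in the data directory and probe it with signal 0. This is a rough sketch under assumptions: the helper names read_postmaster_pid and pid_is_running are hypothetical, and the data directory path is whatever your installation uses.

```python
import os
from pathlib import Path


def read_postmaster_pid(data_dir):
    """Return the PID from the first line of postmaster.pid, or None
    if the file is missing or empty."""
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        first_line = pid_file.read_text().splitlines()[0]
    except (FileNotFoundError, IndexError):
        return None
    return int(first_line.strip())


def pid_is_running(pid):
    """Probe a PID with signal 0, which checks existence without
    delivering any signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A missing postmaster.pid suggests the postmaster has gone away, while a PID that still answers the probe means it is still there.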

A gdb stack trace from whichever processes are still there after -m
immediate could be informative. Another thing you could try is a
manual "kill -QUIT pid" on the uncooperative backend(s).

regards, tom lane
