Re: Immediate shutdown and system(3)

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Immediate shutdown and system(3)
Date: 2009-03-02 06:08:57
Message-ID: 3f0b79eb0903012208g6d427cear275df57524d14e17@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Fri, Feb 27, 2009 at 6:52 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> We're using SIGQUIT to signal immediate shutdown request. Upon receiving
> SIGQUIT, postmaster in turn kills all the child processes with SIGQUIT and
> exits.
>
> This is a problem when child processes use system(3) to call other programs.
> We use system(3) in two places: to execute archive_command and
> restore_command. Fujii Masao identified this with pg_standby back in
> November:
>
> http://archives.postgresql.org/message-id/3f0b79eb0811280156s78a3730en73aca49b6e95d3cb@mail.gmail.com
> and recently discussed here
> http://archives.postgresql.org/message-id/3f0b79eb0902260919l2675aaafq10e5b2d49ebfa3a1@mail.gmail.com
>
> I'm starting a new thread to bring this to the attention of those who haven't
> been following the hot standby stuff. pg_standby has a particular problem
> because it traps SIGQUIT to mean "end recovery, promote standby to master",
> which it shouldn't do IMHO. But ignoring that for a moment, the problem is
> generic.
>
> SIGQUIT by default dumps core. That's not what we want to happen on
> immediate shutdown. All PostgreSQL processes trap SIGQUIT to exit
> immediately instead, but external commands will dump core. system(3) ignores
> SIGQUIT, so we can't trap it in the parent process; it is always relayed to
> the child.
>
> There are a few options for fixing that:
>
> 1. Implement a custom version of system(3) using fork+exec that lets us
> trap SIGQUIT and send e.g. SIGTERM or SIGINT to the child instead. It might
> be a bit tricky to get this right in a portable way; Windows would certainly
> need a completely separate implementation.
>
> 2. Use a signal other than SIGQUIT for immediate shutdown of child
> processes. We can't change the signal sent to postmaster for
> backwards-compatibility reasons, but the signal sent by postmaster to child
> processes we could change. We've already used all signals in normal
> backends, but perhaps we could rearrange them.
>
> 3. Use SIGINT instead of SIGQUIT for immediate shutdown of the two child
> processes that use system(3): the archiver process and the startup process.
> Neither of them uses SIGINT currently. SIGINT is ignored by system(3), like
> SIGQUIT, but the default action is to terminate the process rather than core
> dump. Unfortunately pg_standby traps SIGINT too to mean "promote to master",
> but we could change it to use SIGUSR1 instead for that purpose. If someone
> has a script that uses "killall -INT pg_standby" to promote a standby server
> to master, it would need to be changed. Looking at the manual page of
> pg_standby, however, it seems that the kill-method of triggering a promotion
> isn't documented, so with a notice in release notes we could do that.
>
> I'm leaning towards option 3, but I wonder if anyone sees a better solution.
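
For reference, option 1 in the quoted list might look roughly like the sketch
below on POSIX systems. It is only an illustration under stated assumptions:
the names run_command_forwarding_quit, forward_sigquit_as_sigterm and
forward_target are hypothetical and do not exist in PostgreSQL, SIGTERM is
just one of the two replacement signals Heikki mentions, and Windows is not
covered at all.

```c
#define _DEFAULT_SOURCE          /* for glibc under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical: pid of the running command. It is written only while
 * SIGQUIT is blocked, so the handler never sees a half-updated value.
 */
static volatile pid_t forward_target = 0;

/* SIGQUIT handler: relay the request to the command as SIGTERM. */
static void
forward_sigquit_as_sigterm(int signo)
{
    int saved_errno = errno;

    (void) signo;
    if (forward_target > 0)
        kill(forward_target, SIGTERM);  /* kill() is async-signal-safe */
    errno = saved_errno;
}

/*
 * Hypothetical replacement for system(3): run "command" through /bin/sh,
 * but keep SIGQUIT trapped in the caller instead of ignoring it.
 * Returns the waitpid() status, or -1 on failure.
 */
int
run_command_forwarding_quit(const char *command)
{
    struct sigaction sa, old_sa;
    sigset_t    block, old_mask;
    pid_t       pid;
    int         status = -1;

    assert(command != NULL);

    /* Block SIGQUIT until forward_target is known. */
    sigemptyset(&block);
    sigaddset(&block, SIGQUIT);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0)
        return -1;

    sa.sa_handler = forward_sigquit_as_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGQUIT, &sa, &old_sa) != 0)
    {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        return -1;
    }

    pid = fork();
    if (pid == 0)
    {
        /* Child: undo the parent's signal setup, then exec the shell. */
        sigaction(SIGQUIT, &old_sa, NULL);
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);
    }

    if (pid > 0)
    {
        forward_target = pid;
        sigprocmask(SIG_SETMASK, &old_mask, NULL);  /* handler may run now */
        while (waitpid(pid, &status, 0) < 0)
        {
            if (errno != EINTR)
            {
                status = -1;
                break;
            }
        }
        sigprocmask(SIG_BLOCK, &block, NULL);
        forward_target = 0;
    }

    /* Restore the caller's SIGQUIT disposition and signal mask. */
    sigaction(SIGQUIT, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return status;
}
```

A caller would inspect the result with WIFEXITED/WIFSIGNALED just as for
system(3). The sketch only covers a SIGQUIT sent to the calling process; a
signal sent to the whole process group would still reach the command directly.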

4. Use shared memory to tell the startup process about the shutdown state.
When a shutdown signal arrives, the postmaster stores the corresponding
shutdown state in shared memory before signaling the child processes. The
startup process checks that state whenever it executes system(), and decides
how to exit accordingly. This solution doesn't change any existing
behavior of pg_standby. What is your opinion?
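
To make the idea concrete, here is a minimal sketch of option 4, not a patch.
All names in it (SharedShutdown, shared_shutdown_create,
publish_shutdown_state, run_and_check and the enum values) are hypothetical,
and an anonymous shared mmap() region stands in for PostgreSQL's real shared
memory. A real implementation would use the existing shared memory and
synchronization facilities instead of a bare volatile flag.

```c
#define _DEFAULT_SOURCE          /* MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical shutdown states published by the postmaster. */
typedef enum
{
    SHUTDOWN_NONE = 0,
    SHUTDOWN_SMART,
    SHUTDOWN_FAST,
    SHUTDOWN_IMMEDIATE
} ShutdownState;

/* Hypothetical shared-memory area visible to postmaster and children. */
typedef struct
{
    volatile int state;         /* holds a ShutdownState value */
} SharedShutdown;

/* What the child should do after system() returns. */
typedef enum
{
    CMD_OK,                     /* command succeeded, carry on */
    CMD_FAILED,                 /* command failed, normal error handling */
    CMD_EXIT_NOW                /* immediate shutdown was requested */
} CommandOutcome;

/* Create the area before forking children so that they inherit it. */
SharedShutdown *
shared_shutdown_create(void)
{
    void *p = mmap(NULL, sizeof(SharedShutdown), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    ((SharedShutdown *) p)->state = SHUTDOWN_NONE;
    return (SharedShutdown *) p;
}

/* Postmaster side: call this BEFORE signaling the child processes. */
void
publish_shutdown_state(SharedShutdown *shm, ShutdownState state)
{
    assert(shm != NULL);
    shm->state = (int) state;
}

/*
 * Child side (e.g. the startup process): run the command, then consult
 * the shared state to decide how to react to the result.
 */
CommandOutcome
run_and_check(SharedShutdown *shm, const char *command)
{
    int rc = system(command);

    if (shm->state == SHUTDOWN_IMMEDIATE)
        return CMD_EXIT_NOW;    /* exit quickly, whatever rc says */
    return (rc == 0) ? CMD_OK : CMD_FAILED;
}
```

In this sketch the startup process would call run_and_check() wherever it
currently calls system() for restore_command, and exit without further
processing when it gets CMD_EXIT_NOW.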

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
