Re: SIGQUIT on archiver child processes maybe not such a hot idea?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SIGQUIT on archiver child processes maybe not such a hot idea?
Date: 2019-09-03 19:43:37
Message-ID: 20190903194337.GV16436@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Michael Paquier (michael(at)paquier(dot)xyz) wrote:
> On Mon, Sep 02, 2019 at 12:27:09AM +0000, Tsunakawa, Takayuki wrote:
> > From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> >> After investigation, the mechanism that's causing that is that the
> >> src/test/recovery/t/010_logical_decoding_timelines.pl test shuts
> >> down its replica server with a mode-immediate stop, which causes
> >> that postmaster to shut down all its children with SIGQUIT, and
> >> in particular that signal propagates to a "cp" command that the
> >> archiver process is executing. The "cp" is unsurprisingly running
> >> with default SIGQUIT handling, which per the signal man page
> >> includes dumping core.
> >
> > We've experienced this (core dump in the data directory by an
> > archive command) years ago. Related to this, the example of using
> > cp in the PostgreSQL manual is misleading, because cp doesn't
> > reliably persist the WAL archive file.
>
> The previous talks about having pg_copy are still where they were a
> couple of years ago as we did not agree on which semantics it should
> have. If we could move forward with that and update the documentation
> from its insanity that would be great and... The signal handling is
> something else we could customize in a more favorable way with the
> archiver. Anyway, switching to something other than SIGQUIT to stop
> the archiver will not prevent any other tools from generating core
> dumps with this other signal.

Any tool being used as an archive_command (which should basically be
something designed to be used as such, and certainly not cp...) should be
prepared to handle what PG ends up doing here. I don't think we should
change to a different signal just because it'll make 'cp' do something
different. If there's a good reason to use a different signal, great.

In other words, I think I agree with Tom that maybe we should be using
SIGINT here, but not so much because of exactly what cp does but rather
because that's a more appropriate signal, as shown by what the default
handling for those signals is.

Thanks,

Stephen
