Re: 9.2 recovery/startup problems

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.2 recovery/startup problems
Date: 2014-12-02 16:54:12
Message-ID: CAMkU=1zAern2uby+fYveXrO-HY3cfS_uyv8SmBCMNipBXSOiUg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 2, 2014 at 7:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Nov 26, 2014 at 7:13 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > If I do a pg_ctl stop -mf, then both files go away. If I do a pg_ctl stop
> > -mi, then neither goes away. It is only with the /sbin/reboot that I get
> > the fatal combination of _init being gone but the other still present.
>
> Eh? That sounds wonky.
>
> I mean, reboot normally kills processes with SIGTERM or SIGKILL, in
> which case I'd expect the outcome to match what you get with pg_ctl
> stop -mf or pg_ctl stop -mi. The only way I can see that you'd get a
> different behavior is if you did a hard reboot (like echo b >
> /proc/sysrq-trigger); if that changes things, then we might have a
> missing-fsync bug. How is that reboot managing to leave the main fork
> behind while losing the init fork?
>

During abort processing after getting a SIGTERM, the backend truncates
59288 (the main fork) to zero size and unlinks all the other files
(including 59288_init). The actual removal of 59288 is left until the
next checkpoint. So if you SIGTERM the backend and then take down the
server uncleanly before that checkpoint completes, you are left with just
59288.

Here is the strace:

open("base/16416/59288", O_RDWR) = 8
ftruncate(8, 0) = 0
close(8) = 0
unlink("base/16416/59288.1") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_fsm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_vm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_init") = 0
unlink("base/16416/59288_init.1") = -1 ENOENT (No such file or directory)
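
To make the ordering concrete, here is a rough simulation of that sequence in a scratch directory. It is only a sketch, not PostgreSQL code: the function names (abort_cleanup, checkpoint_cleanup, simulate) are made up for illustration, and the checkpoint step is reduced to a single unlink on the assumption that nothing else touches the file.

```python
import os
import tempfile


def abort_cleanup(dbdir, relfilenode):
    """Replay the syscalls from the strace: truncate the main fork, unlink the rest."""
    fd = os.open(os.path.join(dbdir, relfilenode), os.O_RDWR)
    os.ftruncate(fd, 0)          # main fork is kept, but emptied
    os.close(fd)
    for suffix in (".1", "_fsm", "_vm", "_init", "_init.1"):
        try:
            os.unlink(os.path.join(dbdir, relfilenode + suffix))
        except FileNotFoundError:
            pass                 # ENOENT, as in the strace


def checkpoint_cleanup(dbdir, relfilenode):
    """Hypothetical stand-in for the deferred removal of the main fork at checkpoint."""
    os.unlink(os.path.join(dbdir, relfilenode))


def simulate(crash_before_checkpoint):
    """Return sorted (name, size) pairs left behind in the database directory."""
    with tempfile.TemporaryDirectory() as root:
        dbdir = os.path.join(root, "base", "16416")
        os.makedirs(dbdir)
        # Start with a main fork and an init fork, as for an unlogged relation.
        for name in ("59288", "59288_init"):
            with open(os.path.join(dbdir, name), "wb") as f:
                f.write(b"\0" * 8192)
        abort_cleanup(dbdir, "59288")
        if not crash_before_checkpoint:
            checkpoint_cleanup(dbdir, "59288")
        return sorted(
            (n, os.path.getsize(os.path.join(dbdir, n))) for n in os.listdir(dbdir)
        )


if __name__ == "__main__":
    print(simulate(crash_before_checkpoint=True))   # [('59288', 0)]
    print(simulate(crash_before_checkpoint=False))  # []
```

With crash_before_checkpoint=True the only survivor is a zero-length 59288 with no _init next to it, which is the combination seen after /sbin/reboot.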

Cheers,

Jeff
