Better detection of staled postmaster.pid

From: Pavel Raiskup <praiskup(at)redhat(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Better detection of staled postmaster.pid
Date: 2015-08-31 14:12:20
Message-ID: 1711927.hbtYs8Lf7C@nb.usersys.redhat.com
Lists: pgsql-hackers

This is most likely just a request for brainstorming.

It's been reported [1] that postmaster fails to start against a stale
postmaster.pid after (e.g.) a power outage on Fedora. This is due to init
system parallelism: some other newly started process may already have been
allocated the same PID that the old postmaster had, and in that case
postmaster refuses to delete the stale pidfile (which is expected, as we
need to be really careful).
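
To illustrate why PID reuse trips this up: a liveness probe in the style of
kill(pid, 0) can only say that some process owns the PID, not which program
it is. The sketch below is Python and purely illustrative; the name
pid_in_use is made up here, and this is not the postmaster's actual code.

```python
import os


def pid_in_use(pid: int) -> bool:
    """Return True if any process currently owns this PID.

    Signal 0 performs only the existence and permission checks; nothing
    is delivered. The answer says nothing about what the process is.
    """
    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        raise ValueError("pid must be positive")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # ESRCH: no process with this PID
    except PermissionError:
        return True   # EPERM: the PID exists but belongs to another user
    return True
```

After a reboot, any unrelated service that happens to receive the old PID
makes such a probe return True, so the pidfile looks live.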

Can you think of some other check we could implement to guarantee that the
PID mentioned in postmaster.pid does not hide a concurrent postmaster?
I can think of parsing /proc/<CONCURRENT_PID>/cmdline for a possible '-D'
option, but that is not terribly portable and it could be considered racy,
couldn't it? Is there some acceptable hack we could use to tell other
processes that we are running on a particular data directory?
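
A minimal sketch of the /proc idea, in Python for brevity and assuming a
Linux-style /proc where cmdline holds the NUL-separated argv. The function
names are hypothetical. It shares the weaknesses already mentioned: it is
Linux-only, the process may exit or be replaced between the read and any
decision based on it, and a postmaster that got its data directory from the
PGDATA environment variable would show no '-D' at all.

```python
import os
from typing import Optional


def data_dir_from_cmdline(raw: bytes) -> Optional[str]:
    """Return the value of a '-D' option in a NUL-separated argv blob."""
    if raw.endswith(b"\x00"):
        raw = raw[:-1]  # drop the terminator of the last argument
    args = [a.decode(errors="replace") for a in raw.split(b"\x00")]
    # Index 0 is the program name, so start looking at index 1.
    for i in range(1, len(args)):
        arg = args[i]
        if arg == "-D":
            # Separate form: -D /path
            return args[i + 1] if i + 1 < len(args) else None
        if arg.startswith("-D") and len(arg) > 2:
            # Attached form: -D/path
            return arg[2:]
    return None


def pid_claims_data_dir(pid: int, data_dir: str) -> bool:
    """Best-effort check: does this PID's argv name data_dir via '-D'?"""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            raw = f.read()
    except OSError:
        return False  # no such process, no /proc, or not readable
    found = data_dir_from_cmdline(raw)
    if found is None:
        return False
    # Caveat: a relative '-D' path is resolved against our own working
    # directory here, not the other process's.
    return os.path.realpath(found) == os.path.realpath(data_dir)
```

A False result only means that no matching '-D' was seen, so on its own it
would not be enough to justify deleting the pidfile.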

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1257334

Pavel
