Re: Better detection of staled postmaster.pid

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Pavel Raiskup <praiskup(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Better detection of staled postmaster.pid
Date: 2015-08-31 14:20:42
Message-ID: 1049479543.2276593.1441030842969.JavaMail.yahoo@mail.yahoo.com
Lists: pgsql-hackers

Pavel Raiskup <praiskup(at)redhat(dot)com> wrote:

> It's been reported [1] that postmaster fails to start against a stale
> postmaster.pid after (e.g.) a power outage on Fedora. It's due to init system
> parallelism: "some" other newly started process can already have been allocated
> the same PID as the old postmaster had, and in this case postmaster refuses
> to delete the stale pidfile (which is expected as we need to be really
> careful).
>
> Don't you see some other possible check we could implement to guarantee that
> the PID mentioned in postmaster.pid does not hide concurrent postmaster?

Was the other newly started process another PostgreSQL cluster?
Was it launched under the same OS user? (Those are the only
conditions under which I've seen this.) I think it is wise to use
a separate OS user for each cluster.
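
To illustrate why the owning OS user matters, here is a minimal sketch of a pidfile liveness probe. It assumes the startup check is a signal-0 probe of the PID recorded on the first line of postmaster.pid; that is an assumption made for illustration, not a quotation of the PostgreSQL source. The helper names `read_pidfile_pid` and `pid_looks_alive_to_us` are hypothetical, and the code is POSIX-only.

```python
import os


def read_pidfile_pid(path: str) -> int:
    """Return the PID stored on the first line of a pidfile (hypothetical helper)."""
    with open(path, "r", encoding="ascii") as f:
        return int(f.readline().strip())


def pid_looks_alive_to_us(pid: int) -> bool:
    """Return True if `pid` names a live process that this user may signal.

    Assumption for this sketch: a process we are not permitted to signal
    belongs to another OS user and so is treated as "not our postmaster".
    """
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence and permission
    except ProcessLookupError:
        return False  # ESRCH: no such process, so the pidfile would be stale
    except PermissionError:
        return False  # EPERM: the process exists but is owned by a different OS user
    return True  # a live process owned by us; it may or may not be a postmaster
```

Under these assumptions, a recycled PID only blocks startup when the unrelated process runs under the same OS user, since only then does the probe succeed. A probe like this cannot tell what kind of process holds the PID, which is the limitation the original report describes.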

If it's not a matter of multiple clusters running under the same OS
user, please provide more details, like the specific version and a
copy/paste of the error messages and relevant log entries.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
