Re: FATAL: bogus data in lock file "postmaster.pid": ""

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Beattie <mtbeedee(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FATAL: bogus data in lock file "postmaster.pid": ""
Date: 2012-08-28 01:49:27
Message-ID: 20120828014927.GB6786@momjian.us
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 07:39:35PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > How about having it sleep for a short while, then try again?
>
> I could get behind that, but I don't think the delay should be more than
> 100ms or so. It's important for the postmaster to acquire the lock (or
> not) pretty quickly, or pg_ctl is going to get confused. If we keep it
> short, we can also dispense with the log spam you were suggesting.
>
> (Actually, I wonder if this type of scenario isn't going to confuse
> pg_ctl already --- it might think the lockfile belongs to the postmaster
> *it* started, not some pre-existing one. Does that matter?)

I took Alvaro's approach of a sleep. The file test was already in a
loop that went 100 times. Basically, if the lock file exists, this
postmaster isn't going to succeed, so I figured there is no reason to
rush in the testing. I gave it 5 tries with one second between
attempts. Either the file is being populated, or it is stale and empty.
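For readers without the attachment, here is a rough sketch in C of the retry idea described above. It is not the attached patch: the function name probe_lock_file, the LockFileStatus enum, and the max_tries/delay_ms parameters are all made up for illustration, and the real postmaster code does considerably more (for example, checking whether the recorded pid is still alive).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical result codes; not taken from the PostgreSQL sources. */
typedef enum
{
    LOCKFILE_ABSENT,    /* no lock file exists */
    LOCKFILE_HAS_PID,   /* file starts with a positive integer pid */
    LOCKFILE_EMPTY,     /* file stayed empty for every attempt */
    LOCKFILE_BOGUS      /* file unreadable, or contents are not a pid */
} LockFileStatus;

/* Sleep for the given number of milliseconds (POSIX nanosleep). */
static void
sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Read the pid from the lock file at "path". An empty file may simply be
 * one that another postmaster has created but not yet populated, so we
 * retry up to max_tries times, sleeping delay_ms between attempts, before
 * reporting it as empty.
 */
static LockFileStatus
probe_lock_file(const char *path, int max_tries, long delay_ms, long *pid_out)
{
    assert(path != NULL && pid_out != NULL && max_tries > 0);

    for (int attempt = 1; attempt <= max_tries; attempt++)
    {
        char    buf[64];
        size_t  len;
        FILE   *fp = fopen(path, "r");

        if (fp == NULL)
            return (errno == ENOENT) ? LOCKFILE_ABSENT : LOCKFILE_BOGUS;

        len = fread(buf, 1, sizeof(buf) - 1, fp);
        fclose(fp);
        buf[len] = '\0';

        if (len > 0)
        {
            char   *end;
            long    pid = strtol(buf, &end, 10);

            if (end == buf || pid <= 0)
                return LOCKFILE_BOGUS;
            *pid_out = pid;
            return LOCKFILE_HAS_PID;
        }

        /* Empty so far: wait before the next attempt, if one remains. */
        if (attempt < max_tries)
            sleep_ms(delay_ms);
    }

    return LOCKFILE_EMPTY;
}
```

Calling probe_lock_file(path, 5, 1000, &pid) would correspond to the 5 tries with one second between attempts; in this sketch a file that is still empty after the last try is reported as LOCKFILE_EMPTY, which the caller would treat as stale.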

I checked pg_ctl and that has a default wait of 60 seconds, so 5 seconds
to exit out of the postmaster should be fine.

Patch attached.

FYI, I noticed we have a similar 5-second creation time requirement in
pg_ctl:

    /*
     * The postmaster should create postmaster.pid very soon after being
     * started. If it's not there after we've waited 5 or more seconds,
     * assume startup failed and give up waiting. (Note this covers both
     * cases where the pidfile was never created, and where it was created
     * and then removed during postmaster exit.) Also, if there *is* a
     * file there but it appears stale, issue a suitable warning and give
     * up waiting.
     */
    if (i >= 5)

This covers the case where the file is missing or contains an old pid,
rather than the case where it is empty.
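To make the shape of that pg_ctl check concrete, here is a much-simplified sketch in C. It is not pg_ctl's actual code: the function name pid_file_appeared and its grace_ticks/tick_ms parameters are invented for the example, and the staleness test mentioned in the comment is left out entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/*
 * Poll for the pid file once per tick. Return true as soon as it can be
 * opened; return false if it is still missing after grace_ticks ticks
 * have been waited out. (Hypothetical helper, for illustration only.)
 */
static bool
pid_file_appeared(const char *path, int grace_ticks, long tick_ms)
{
    assert(path != NULL && grace_ticks >= 0 && tick_ms >= 0);

    for (int i = 0;; i++)
    {
        struct timespec ts;
        FILE   *fp = fopen(path, "r");

        if (fp != NULL)
        {
            fclose(fp);
            return true;
        }

        /* Same shape as the "if (i >= 5)" test quoted above. */
        if (i >= grace_ticks)
            return false;

        ts.tv_sec = tick_ms / 1000;
        ts.tv_nsec = (tick_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
    }
}
```

With grace_ticks = 5 and tick_ms = 1000 this gives up once five seconds have passed without the file showing up, which is the same order of delay as the 5 one-second tries in the postmaster.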

FYI, I fixed the filename problem Tom found.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
pid.diff text/x-diff 1.0 KB
