Re: postgresql-[any version] from FreeBSD ports - startup problems after crash

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ruslan A Dautkhanov <rusland(at)scn(dot)ru>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: postgresql-[any version] from FreeBSD ports - startup problems after crash
Date: 2006-05-15 13:23:33
Message-ID: 2289.1147699413@sss.pgh.pa.us
Lists: pgsql-bugs

Ruslan A Dautkhanov <rusland(at)scn(dot)ru> writes:
> Server rebooted occasionally after power failure.
> And I have stale postmaster.pid file, so postmaster didn't start with error
> bill postgres[600]: [1-1] FATAL: file "postmaster.pid" already exists

You probably need a newer postgres version (you didn't say what you are
using) and/or a more carefully written start script.

Your proposed change in the start script is useless --- do you think the
postmaster doesn't check that already? Furthermore, it's actually
dangerous for reasons we need not get into here; suffice it to say that
automated removal of that lock file is NOT a good idea.

The problem comes up when the startup timing is just different enough
that the PID belonging to the postmaster in the previous boot cycle now
belongs to the shell that's launching it. The postmaster sees a live
process of the correct userid (i.e., postgres) and has to assume that
that's a pre-existing postmaster.
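
To illustrate the failure mode described above, here is a minimal sketch in
Python of a liveness check on the PID recorded in a lock file. It is not the
postmaster's actual code; the function names pid_is_alive and lock_looks_valid
are hypothetical, and treating a permission error as "owned by another user,
so not a postmaster" is an assumption made to mirror the "correct userid"
condition.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists and is owned by us."""
    if pid <= 0:
        return False
    try:
        # Signal 0 delivers nothing; it only performs the existence and
        # permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process, so the recorded PID is stale
    except PermissionError:
        return False  # process exists but belongs to a different user
    return True


def lock_looks_valid(lock_pid: int) -> bool:
    # Naive rule: any live, same-user process holding the recorded PID is
    # assumed to be a running postmaster. After a reboot that PID may now
    # belong to an unrelated process, such as the launching shell.
    return pid_is_alive(lock_pid)
```

With this rule alone, a lock file left over from before the crash looks valid
whenever its PID has been reused by any process running as the same user.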

We've fixed this in recent releases by having the postmaster also check
for a match to its parent process ID (getppid). The care in the start
script comes because this only works for one level up. Therefore, you
can't "su -c pg_ctl start ..." because that would create three levels of
postgres-owned processes (shell, pg_ctl, postmaster) and if the PID
count is off by 2 instead of 1 then we still lose. You have to invoke
the postmaster directly, "su -c postmaster ...". (Hm, actually it might
work to do "su -c 'exec pg_ctl ...'" ... I have not tried that.)
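
The one-level limit can be shown with a small sketch, again in Python and
again not the postmaster's real implementation. The function lock_is_stale and
its parameters are hypothetical; the PIDs are passed in explicitly (instead of
calling os.getpid() and os.getppid()) so the two launch scenarios can be
compared side by side.

```python
from typing import Callable


def lock_is_stale(lock_pid: int, my_pid: int, parent_pid: int,
                  alive: Callable[[int], bool]) -> bool:
    """Decide whether the PID recorded in the lock file can be ignored."""
    # The recorded PID is our own or our direct parent's: it cannot be
    # another postmaster, so the lock file is left over from an earlier boot.
    if lock_pid == my_pid or lock_pid == parent_pid:
        return True
    # Otherwise fall back to the liveness check.
    return not alive(lock_pid)


def everything_alive(pid: int) -> bool:
    # Stand-in liveness check: pretend every PID is a live postgres-owned
    # process.
    return True


# Old postmaster had PID 600. After reboot the shell gets 600 and the
# postmaster it launches directly gets 601: the parent check catches it.
direct = lock_is_stale(600, my_pid=601, parent_pid=600, alive=everything_alive)

# With shell (600) -> pg_ctl (601) -> postmaster (602), PID 600 is the
# grandparent, so the check misses it and startup is refused.
via_pg_ctl = lock_is_stale(600, my_pid=602, parent_pid=601, alive=everything_alive)
```

Here direct is True (the stale lock is recognized) and via_pg_ctl is False
(the shell two levels up is mistaken for a running postmaster).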

regards, tom lane
