Weaker shmem interlock w/o postmaster.pid

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Weaker shmem interlock w/o postmaster.pid
Date: 2013-09-11 03:33:41
Message-ID: 20130911033341.GD225735@tornado.leadboat.com
Lists: pgsql-hackers

If a starting postmaster's CreateLockFile() finds an existing postmaster.pid,
it subjects the shared memory segment named therein to the careful scrutiny of
PGSharedMemoryIsInUse(). If that segment matches the current data directory
and has any attached processes, we bail with the "pre-existing shared memory
block ... is still in use" error. When the postmaster.pid file is missing,
there's inherently less we can do to reliably detect this situation; in
particular, an old postmaster could have chosen an unusual key due to the
usual 1+(port*1000) key being in use. That being said, PGSharedMemoryCreate()
typically will stumble upon the old segment, and it (its sysv variant, anyway)
applies checks much weaker than those of PGSharedMemoryIsInUse(). If the
segment has a PGShmemHeader and the postmaster PID named in that header is not
alive, PGSharedMemoryCreate() will delete the segment and proceed. Shouldn't
it instead check the same things as PGSharedMemoryIsInUse()?
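
To make the difference between the two levels of checking concrete, here is a minimal sketch in C of the "has any attached processes" test. It is not the PostgreSQL source: the helper names usual_shmem_key() and segment_has_attached_processes() are hypothetical, the error handling is an assumption of this sketch, and the data-directory comparison that PGSharedMemoryIsInUse() also performs is omitted.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* The usual key mentioned above: 1 + (port * 1000). */
static key_t
usual_shmem_key(int port)
{
    return (key_t) (1 + port * 1000);
}

/*
 * Hypothetical helper, not a PostgreSQL function: report whether any
 * process is still attached to the System V segment named by shmid.
 */
static bool
segment_has_attached_processes(int shmid)
{
    struct shmid_ds buf;

    if (shmctl(shmid, IPC_STAT, &buf) < 0)
    {
        /* EINVAL or EIDRM: no such segment, so nothing can be attached. */
        if (errno == EINVAL || errno == EIDRM)
            return false;

        /* Assumption of this sketch: on any other error, assume in use. */
        return true;
    }

    /* shm_nattch counts current attachments, whoever created the segment. */
    return buf.shm_nattch > 0;
}
```

A check that looks only at whether the creating postmaster's PID is alive misses exactly the case an attach count catches: the postmaster is gone, but a surviving backend still has the segment mapped.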

The concrete situation in which I encountered this involved PostgreSQL 9.2 and
an immediate shutdown with a backend that had blocked SIGQUIT. The backend
survived the immediate shutdown as one would expect. The postmaster
nonetheless removed postmaster.pid before exiting, and I could immediately
restart PostgreSQL despite the survival of the SIGQUIT-blocked backend. If I
instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
backends before restarting. The postmaster should not remove postmaster.pid
unless it has verified that its children have exited. Concretely, that means
not removing postmaster.pid on immediate shutdown in 9.3 and earlier. That's
consistent with the rough nature of an immediate shutdown, anyway.
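
The survival of a SIGQUIT-blocked process can be reproduced outside PostgreSQL. The following is a minimal sketch under stated assumptions: the function name blocked_child_survives_sigquit() is hypothetical, the child merely stands in for a backend, and the 100 ms wait is an arbitrary choice for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Fork a child that has SIGQUIT blocked, send it SIGQUIT, and report
 * whether it is still running shortly afterwards.  The child is then
 * reaped with SIGKILL, which cannot be blocked.
 */
static bool
blocked_child_survives_sigquit(void)
{
    sigset_t block, old;
    struct timespec delay = {0, 100 * 1000 * 1000};    /* 100 ms */
    pid_t pid;
    int status;
    bool survived;

    /* Block before fork() so the child inherits the mask without a race. */
    sigemptyset(&block);
    sigaddset(&block, SIGQUIT);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return false;

    pid = fork();
    if (pid == 0)
    {
        /* Child: SIGQUIT stays pending, so pause() never returns. */
        for (;;)
            pause();
    }

    /* Parent: restore the original signal mask. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    if (pid < 0)
        return false;

    kill(pid, SIGQUIT);
    nanosleep(&delay, NULL);

    /* waitpid() with WNOHANG returns 0 while the child is still running. */
    survived = (waitpid(pid, &status, WNOHANG) == 0);

    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return survived;
}
```

A parent that removes its lock file right after sending SIGQUIT, without waiting for such a child to exit, leaves no on-disk evidence that the child is still running.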

I'm inclined to preserve postmaster.pid at immediate shutdown in all released
versions, but I'm less sure about back-patching a change to make
PGSharedMemoryCreate() pickier. On the one hand, allowing startup to proceed
with backends still active in the same data directory is a corruption hazard.
On the other hand, it could break weird shutdown/restart patterns that permit
trivial lifespan overlap between backends of different postmasters. Opinions?

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
