Re: Weaker shmem interlock w/o postmaster.pid

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Weaker shmem interlock w/o postmaster.pid
Date: 2013-09-11 15:32:01
Message-ID: 20130911153200.GD2706@tamriel.snowman.net
Lists: pgsql-hackers

* Noah Misch (noah(at)leadboat(dot)com) wrote:
> Shouldn't it instead check the same things as PGSharedMemoryIsInUse()?

Offhand, I tend to agree that we should be very careful about checking
whether an existing segment is still in use.
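
To make "still in use" concrete, here is a minimal sketch of one way to ask
the kernel whether any process is still attached to a System V shared memory
segment. It is not the PostgreSQL source: segment_may_be_in_use() is a
hypothetical helper, and treating every unexpected shmctl() failure as
"in use" is an assumption chosen to err on the side of refusing startup.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper (not PostgreSQL source): report whether a System V
 * shared memory segment may still have processes attached to it.
 */
static bool
segment_may_be_in_use(int shmid)
{
    struct shmid_ds st;

    if (shmctl(shmid, IPC_STAT, &st) < 0)
    {
        /* EINVAL or EIDRM: the segment no longer exists, so nothing uses it. */
        if (errno == EINVAL || errno == EIDRM)
            return false;

        /* Assumption: on any other error, be conservative and say "in use". */
        return true;
    }

    /* shm_nattch counts the processes currently attached to the segment. */
    return st.shm_nattch > 0;
}
```

A caller would refuse to create a new segment for the data directory while
this returns true for the old segment ID. A real check would also need to
confirm that the segment belongs to the same data directory, which this
sketch leaves out.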

> The concrete situation in which I encountered this involved PostgreSQL 9.2 and
> an immediate shutdown with a backend that had blocked SIGQUIT. The backend
> survived the immediate shutdown as one would expect.

Well, we expect this now because of the analysis you did in the
adjacent thread showing how it can happen.

> The postmaster
> nonetheless removed postmaster.pid before exiting, and I could immediately
> restart PostgreSQL despite the survival of the SIGQUIT-blocked backend. If I
> instead SIGKILL the postmaster, postmaster.pid remains, and I must kill stray
> backends before restarting. The postmaster should not remove postmaster.pid
> unless it has verified that its children have exited.

This makes sense; however...

> Concretely, that means
> not removing postmaster.pid on immediate shutdown in 9.3 and earlier. That's
> consistent with the rough nature of an immediate shutdown, anyway.

I don't like leaving the postmaster.pid file around, even on an
immediate shutdown. I don't have any great suggestions for what to do
instead, given what an 'immediate' shutdown is meant to do, so perhaps
it's acceptable for future releases.

> I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> versions, but I'm less sure about back-patching a change to make
> PGSharedMemoryCreate() pickier. On the one hand, allowing startup to proceed
> with backends still active in the same data directory is a corruption hazard.

The corruption risk, in my view anyway, is sufficient to justify
back-patching the change and outweighs the concerns about very fast
shutdown/restart cycles.

Thanks,

Stephen
