Re: [HACKERS] Weaker shmem interlock w/o postmaster.pid

From: Noah Misch <noah(at)leadboat(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Weaker shmem interlock w/o postmaster.pid
Date: 2018-08-12 06:48:15
Message-ID: 20180812064815.GB2301738@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Wed, Sep 11, 2013 at 10:28:00PM -0400, Noah Misch wrote:
> On Wed, Sep 11, 2013 at 11:32:01AM -0400, Stephen Frost wrote:
> > * Noah Misch (noah(at)leadboat(dot)com) wrote:
> > > Concretely, that means
> > > not removing postmaster.pid on immediate shutdown in 9.3 and earlier. That's
> > > consistent with the rough nature of an immediate shutdown, anyway.

With 9.3 having a few months left, that's less interesting, but ...

> > I don't like leaving the postmaster.pid file around, even on an
> > immediate shutdown. I don't have any great suggestions regarding what
> > to do, given what we try to do wrt 'immediate', so perhaps it's
> > acceptable for future releases.
>
> Fair enough.

... we could still worry about folks doing "kill -9 `head -n1 postmaster.pid`
&& rm postmaster.pid". A possible defense is to record a copy of the shm
identifiers in a second file ("sysv_shmem_key"?) within the data directory.
Since users would have no expectations about a new file's lifecycle, we could
remove it when and only when we verify a segment doesn't exist. However, a
DBA determined to remove postmaster.pid would likely remove both files.

> > > I'm thinking to preserve postmaster.pid at immediate shutdown in all released
> > > versions, but I'm less sure about back-patching a change to make
> > > PGSharedMemoryCreate() pickier. On the one hand, allowing startup to proceed
> > > with backends still active in the same data directory is a corruption hazard.
> >
> > The corruption risk, imv anyway, is sufficient to backpatch the change
> > and overrides the concerns around very fast shutdown/restarts.
>
> Making PGSharedMemoryCreate() pickier in all branches will greatly diminish
> the marginal value of preserving postmaster.pid, so I'm fine with dropping the
> postmaster.pid side of the proposal.

Here's an implementation. The first, small patch replaces an obsolete
errhint() about manually removing shared memory blocks. In its place,
recommend killing processes. Today's text, added in commit 4d14fe0, is too
rarely pertinent to justify a HINT. (You might use ipcrm if a PostgreSQL
process is stuck in D state and has SIGKILL pending, so it will never become
runnable again.) Removal of the ipcclean script ten years ago (39627b1) and
its non-replacement corroborate that manual shm removal is now a niche goal.

In the main patch, one of the tests covers an unsafe case that sysv_shmem.c
still doesn't detect. I could pursue a fix via the aforementioned
sysv_shmem_key file, modulo the possibility of a DBA removing it. I could
also, when postmaster.pid is missing, make sysv_shmem.c check the first N
(N=100?) keys applicable to the selected port. My gut feeling is that neither
thing is worthwhile, but I'm interested to hear other opinions.

This removes the creatorPID test, which commit c715fde added before commit
4d14fe0 added the shm_nattch test. The pair of tests is not materially
stronger than the shm_nattch test by itself. For reasons explained in the
comment at "enum IpcMemoryState", this patch has us cease recycling segments
pertaining to data directories other than our own. This isn't ideal for the
buildfarm, where a postmaster crash bug would permanently leak a shm segment.
I think the benefits for production use cases eclipse this drawback. Did I
miss a reason to like the old behavior?

Single-user mode has been protected even less than normal execution, because
it selected a shm key as though using port zero. Thus, starting single-user
mode after an incomplete shutdown would never detect the pre-shutdown shm.
The patch makes single-user mode work like normal execution. Until commit
de98a7e, single-user mode used malloc(), not a shm segment. That commit also
restricted single-user mode to not recycle old segments. I didn't find a
rationale for that restriction, but a contributing factor may have been the
fact that every single-user process uses the same port-0 shm key space. Also,
initdb starts various single-user backends, and reaping stale segments would
be an odd side effect for initdb. With single-user mode using a real port
number and PostgreSQL no longer recycling segments of other data directories,
I removed that restriction. By the way, as of this writing, my regular
development system has 41 stale segments, all in the single-user key space
(0x00000001 to 0x0000002d with gaps). It's clear how they persisted once
leaked, but I don't know how I leaked them in the first place.

I ran 9327 iterations of the test file, and it did not leak a shm segment in
that time. I tested on Red Hat and on Windows Server 2016; I won't be shocked
if the test (not the code under test) breaks on other Windows configurations.
The patch adds PostgresNode features to enable that test coverage.

The comment referring to a CreateDataDirLockFile() bug refers to this one:
https://postgr.es/m/flat/20120803145635.GE9683%40tornado.leadboat.com

Thanks,
nm

Attachment Content-Type Size
ipcrm-hint-v1.patch text/plain 816 bytes
PGSharedMemoryCreate-safety-v1.patch text/plain 32.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2018-08-12 07:27:31 Re: Allowing printf("%m") only where it actually works
Previous Message Noah Misch 2018-08-12 06:26:15 Re: libpq should not look up all host addresses at once