
Re: POSIX shared memory support

From: Chris Marcellino <cmarcellino(at)apple(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: POSIX shared memory support
Date: 2007-02-27 07:29:10
Lists: pgsql-patches
On Feb 26, 2007, at 10:43 PM, Tom Lane wrote:

> Chris Marcellino <cmarcellino(at)apple(dot)com> writes:
>> The System V shared memory facilities provide a method to determine
>> who is attached to a shared memory segment.
>> This is used to prevent backends that were orphaned by crashed or
>> killed database processes from corrupting the database as it is
>> restarted. The same effect can be achieved using the POSIX APIs,
> ... except that it can't ...
>> but since the POSIX library does not
>> have a way to check who is attached to a segment, atomic segment
>> creation must be used to ensure exclusive access to
>> the database.
> How does that fix the problem?  If you can't actually tell whether
> someone is attached to an existing segment, then you're still up against
> the basic rock-and-a-hard-place issue: either you assume there is no one
> there (and corrupt your database if you're wrong) or you assume there is
> someone there (and force manual intervention by the DBA to recover after
> postmaster crashes).  Neither of these alternatives is really acceptable.

Ignoring the case where backends are still alive in the database,
since they would require intervention or patience either way, there
are two options:
1) There is a postmaster/backend still running and you try to start
another postmaster: the unique segment cannot be closed and
atomically recreated, so startup will fail as it does in the current
implementation.
2) There are no errant processes still in the database: the segment
can be closed and atomically recreated.
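
To illustrate what I mean by atomic creation, here is a minimal sketch in Python rather than the patch's C code. It assumes a POSIX platform, where `SharedMemory(create=True)` opens the segment with `O_CREAT | O_EXCL`; the segment name is made up for the demo. It only shows that a second creator fails while the name exists and succeeds once the segment has been unlinked. How a stale segment is judged safe to remove is not shown.

```python
import os
from multiprocessing import shared_memory


def create_exclusive(name, size=4096):
    """Create the named segment atomically; return None if it already exists."""
    try:
        # create=True maps to O_CREAT | O_EXCL on POSIX, so only one caller wins.
        return shared_memory.SharedMemory(name=name, create=True, size=size)
    except FileExistsError:
        return None


# Hypothetical key name; a real one would be derived per data directory.
name = f"pg_demo_{os.getpid()}"

first = create_exclusive(name)    # no segment yet: creation succeeds
second = create_exclusive(name)   # segment exists: creation is refused

first.close()
first.unlink()                    # remove the name, as a clean shutdown would

third = create_exclusive(name)    # the name is free again: creation succeeds
third.close()
third.unlink()
```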

Try making a build with the patch, then start a postmaster for a  
given folder, delete the lock file and start another postmaster (on a  
different port) in that folder. Please let me know if I am  
overlooking something.

>> In order for this to work, the key name used to open and create the
>> shared memory segment must be unique for each
>> data directory. This is done by using a strong hash of the canonical
>> form of the data directory’s pathname.
> "Strong hash" is not a guarantee, even if you could promise that you
> could get a unique canonical path, which I doubt you can.  In any case
> this fails if the DBA decides to rename the directory on the fly (don't
> laugh; not only are there instances of that in our archives, there are
> people opining that we need to allow it --- even with the postmaster
> still running).

Strong hash is an effective guarantee that many computing paradigms  
are based upon. The collision rate is astronomically small, and can  
be made astronomically smaller with longer hashes.
(For MD5 there would need to be 10^15 postmasters on a server before
a collision is likely, and they all would need to have crashed and
left backends in the database, etc.)
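
As a rough check on the scale involved, the birthday approximation can be evaluated directly. This sketch assumes the full 128-bit MD5 digest is used as the key and that pathnames hash uniformly; both are assumptions for illustration, not statements about the patch. Under them the numbers come out even more favorable than the figure above.

```python
import math


def collision_probability(n, bits=128):
    """Birthday approximation: chance that any two of n uniform hashes collide."""
    x = n * (n - 1) / 2 / 2**bits
    # expm1 keeps precision when x is tiny.
    return -math.expm1(-x)


# Probability of any collision among 10^15 distinct data directories.
p = collision_probability(10**15)

# Number of directories at which a collision reaches even odds.
n_half = math.sqrt(2 * math.log(2) * 2**128)

print(f"p(10^15) = {p:.3g}, 50% point = {n_half:.3g}")
```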

True, renaming is a problem that I had not anticipated at all.
Now that you mention it, hard links might be an issue on some
machines that don't canonicalize them to a unique path, since that
isn't required by the POSIX docs. Oh, the horrible degenerate cases.
Good point though.
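
The renaming problem is easy to reproduce. The sketch below assumes the key is formed as `/pg_` followed by the hex MD5 of the canonicalized path; that exact format and the helper name `shm_name_for` are illustrative assumptions, not the patch's code. After the rename, the same data directory maps to a different key, so a segment left behind under the old name would never be found.

```python
import hashlib
import os
import tempfile


def shm_name_for(data_dir):
    """Derive a segment name from the canonical path (assumed key format)."""
    canonical = os.path.realpath(data_dir)   # resolves symlinks and ".."
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return "/pg_" + digest


with tempfile.TemporaryDirectory() as root:
    old_path = os.path.join(root, "data")
    new_path = os.path.join(root, "data_renamed")
    os.mkdir(old_path)

    name_before = shm_name_for(old_path)
    os.rename(old_path, new_path)            # the DBA renames the directory
    name_after = shm_name_for(new_path)      # same directory, different key

print(name_before != name_after)
```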

Perhaps there is some other unique identifying feature of a given  
database. A per-database persistent UUID would fit nicely here. It  
could just be the shmem key.
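
A minimal sketch of the UUID idea, assuming the identifier is kept in a small file inside the data directory. The file name `cluster_uuid`, the `/pg_` prefix, and the helper `cluster_shm_name` are all hypothetical. Because the key travels with the directory contents instead of its pathname, it survives a rename. The sketch is single-process and does not guard against two processes creating the file at once.

```python
import os
import tempfile
import uuid

ID_FILE = "cluster_uuid"  # hypothetical file name inside the data directory


def cluster_shm_name(data_dir):
    """Return a segment name based on a UUID stored in the data directory."""
    path = os.path.join(data_dir, ID_FILE)
    if not os.path.exists(path):
        # First use: generate the identifier once and persist it.
        with open(path, "w") as f:
            f.write(uuid.uuid4().hex)
    with open(path) as f:
        return "/pg_" + f.read().strip()


with tempfile.TemporaryDirectory() as root:
    old_path = os.path.join(root, "data")
    new_path = os.path.join(root, "data_renamed")
    os.mkdir(old_path)

    key_before = cluster_shm_name(old_path)
    os.rename(old_path, new_path)
    key_after = cluster_shm_name(new_path)   # unchanged by the rename

print(key_before == key_after)
```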

>> This also removes any risk of other applications, or other databases’
>> memory segments colliding with the current shared memory
>> segment, which conveniently simplifies the logic.
> How exactly does it remove that risk?

This is fruitless due to the renaming issue, but the hash isn't an  
issue. I'm not sure that a hex string beginning with \pg_xxxxx is any  
less readable than the shmem id integers that are generated ad-hoc by  
the current implementation.

> I think you're wishfully-thinking
> that if you are creating an unreadable hash value then there will never
> be any collisions against someone else with the same touching faith that
> *his* unreadable hash values will never collide with anyone else's.

I'm flattered that you hold my coding abilities with such devout  
conviction, but I assure you that cryptography, even in this limited  
use, is based in rational thought :).
In addition, the astronomically unlikely collision isn't a risk as
the database can't be damaged. The admin would then need to clear the
lockfile, after he won the lottery twice and was struck by lightning
in his overturned car.

> Doesn't give me a lot of comfort.
> Not that it matters, since the
> approach is broken even if this specific assumption were sustainable.

Postmasters failing to load don't give me much comfort either, and  
that isn't a pipe dream.

I suppose that the renaming issue relegates this patch to situations
where the database cannot be renamed, or hard linked to and started
more than once, yet where one needs to start up databases without
restarting and without needing to control how many other databases
are consuming shmem on the same box.

Thanks for the reply,
Chris Marcellino

> 			regards, tom lane
