Re: fcntl(SETLK) [was Re: 2nd update on TOAST]

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Mascari <mascarm(at)mascari(dot)com>
Cc: Alfred Perlstein <bright(at)wintelcom(dot)net>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fcntl(SETLK) [was Re: 2nd update on TOAST]
Date: 2000-07-08 16:13:49
Message-ID: 25302.963072829@sss.pgh.pa.us
Lists: pgsql-hackers

Mike Mascari <mascarm(at)mascari(dot)com> writes:
> I don't get this. Isn't there a race condition here?

Strictly speaking, there is, but the race window is only a couple
of kernel calls wide, and as Bruce pointed out we do not need something
that is absolutely gold-plated bulletproof. We are just trying to
prevent dbadmins from accidentally starting two postmasters on the
same port number.

The way this would work is that pqcomm.c would do something like

    if (socketFileAlreadyExists) {
        try to open connection to existing postmaster;
        if (successful) {
            report port conflict and die;
        }
        delete existing socket file;
    }
    bind(socket);    // kernel creates new socket file here
    listen();

The race condition here is that if newly-started postmaster A has
executed bind() but not yet listen(), then newly-started postmaster B
could come along, observe the existing socket file, try to open
connection, fail, delete socket file, proceed. AFAIK B will be allowed
to bind() and create a new socket file, and A ends up listening to a
port that's lost in hyperspace --- no one else can ever connect to it
because it has no visible representative in the filesystem.

But as soon as A has executed listen() it's safe --- even though it's
not really ready to accept connections yet, the attempted connect from
B will wait till it does. (We should, therefore, use a plain vanilla
connect attempt for the probe --- no non-blocking connect or anything
fancy.)
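
For illustration only, a plain blocking probe of that kind might look
roughly like the sketch below. It assumes a Unix-domain stream socket;
the helper name socket_file_is_live is hypothetical and is not taken
from pqcomm.c.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from pqcomm.c): probe the Unix-domain socket
 * file at "path" with an ordinary blocking connect().
 *
 * Returns 1 if the connect succeeds (something is listening there),
 * 0 if the connect fails (stale or missing socket file),
 * -1 if the probe socket itself could not be set up.
 */
static int
socket_file_is_live(const char *path)
{
    struct sockaddr_un addr;
    int         fd;
    int         rc;

    /* sun_path is a fixed-size array; refuse paths that do not fit */
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* Plain blocking connect: no O_NONBLOCK, no timeout */
    rc = connect(fd, (struct sockaddr *) &addr, sizeof(addr));
    close(fd);

    return (rc == 0) ? 1 : 0;
}
```

A caller following the outline above would report a port conflict when
this returns 1, and unlink the socket file when it returns 0.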

The bind-to-listen delay in pqcomm.c is currently several lines long,
but there's no reason they couldn't be successive kernel calls with
nothing but a test for bind() failure between.
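
As a rough sketch of what "successive kernel calls" could look like,
again assuming a Unix-domain stream socket (bind_and_listen and its
backlog parameter are hypothetical, not the actual pqcomm.c code):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from pqcomm.c): create a Unix-domain socket
 * at "path" and start listening on it, with only the bind() failure
 * test between the two kernel calls.
 *
 * Returns the listening descriptor, or -1 on any failure.  If listen()
 * fails after a successful bind(), the socket file is left in place.
 */
static int
bind_and_listen(const char *path, int backlog)
{
    struct sockaddr_un addr;
    int         fd;

    /* sun_path is a fixed-size array; refuse paths that do not fit */
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Do all address setup before bind() so nothing slow follows it */
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* bind() creates the socket file; listen() follows immediately */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0)
    {
        close(fd);
        return -1;
    }

    return fd;
}
```

Keeping error reporting and other bookkeeping out of the gap between
the two calls is what narrows the window described above.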

That strikes me as plenty close enough...

regards, tom lane
