Re: fcntl(SETLK) [was Re: 2nd update on TOAST]

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, Philip Warner <pjw(at)rhyme(dot)com(dot)au>, PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fcntl(SETLK) [was Re: 2nd update on TOAST]
Date: 2000-07-07 23:00:47
Message-ID: 13925.963010847@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Quoth the file system standard:

> `sharedstatedir'
> The directory for installing architecture-independent data files
> which the programs modify while they run. This should normally be
> `/usr/local/com', but write it as `$(prefix)/com'. (If you are
> using Autoconf, write it as `@sharedstatedir@'.)

> The problem with this approach is making that directory writeable by the
> server account.

The lock directory should certainly be one used only for Postgres locks,
owned by postgres user and writable only by postgres user.

> 2) Making initdb executable as root but with some --user switch. Have it
> create a subdirectory of $sharedstatedir writable by the server
> account, possibly with sticky bit and whatnot. Use `su' to invoke
> `postgres'.

> This approach might be convenient also in terms of creating the data
> directory.

We could do that, or we could just say "you must have arranged for
creation of these directories before you run initdb". For the truly
lazy, a small script that could be executed as root could be provided.

Personally I'd be unwilling to run a script as complex as initdb as
root; what if it goes wrong? Keep the stuff that requires root
permission separate, and as small as possible.

BTW, regardless of where exactly the lock directory lives (and IIRC
there were several schools of thought on that), I believe that the
lock directory pathname has to be wired in at configure time. It
can't be an initdb argument because the whole locking thing is useless
unless all the PG installations on a machine agree on where the port
locks are.

> Btw., what would happen if we did start a second postmaster at the same
> TCP port? Or more interestingly, what happens if some completely different
> program already runs at that port? How do we protect against that? This
> has something to do with SO_REUSEADDR, but I don't understand those things
> too well.

SO_REUSEADDR solves the problem for TCP sockets. The problem with Unix
sockets is that the kernel's detection of conflicts is pretty braindead:
if there is an existing socket file of the same name, you get an
"address in use" failure from bind(), regardless of whether anyone else
is actually using the socket. So, if the previous postmaster died
ungracefully and didn't delete its socket file, a new postmaster cannot
be started up until the old socket file is removed. What we're trying
to do here is automate that removal so the admin doesn't have to do it.
The trouble is we can't just unlink() the old socket file because
that'll succeed even if there is a postmaster actively using the socket!
So we need to find out whether the old postmaster is still alive
to decide whether it's OK to remove the old socket file or whether we
should abort startup.
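
To make the stale-socket-file failure concrete, here is a minimal sketch in C. It is not postmaster code: the helper names bind_unix_socket and stale_socket_bind_errno are hypothetical, and the sketch only assumes the usual POSIX socket calls. It binds a Unix socket, closes it without unlinking the file (as a crashed process would leave it), and then tries to bind the same path again.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a stream socket to a Unix-domain path.
 * Returns the fd on success, or -1 with errno set by socket()/bind(). */
static int
bind_unix_socket(const char *path)
{
    struct sockaddr_un addr;
    int         fd;

    /* Illustration only: the caller must pass a path that fits. */
    assert(strlen(path) < sizeof(addr.sun_path));

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
    {
        int         save_errno = errno;

        close(fd);
        errno = save_errno;
        return -1;
    }
    return fd;
}

/* Hypothetical demo: simulate an ungraceful exit (socket closed, file
 * left behind) and return the errno from a second bind() to that path,
 * or 0 if the second bind() unexpectedly succeeds. */
static int
stale_socket_bind_errno(const char *path)
{
    int         fd;
    int         err = 0;

    unlink(path);               /* start from a clean slate */
    fd = bind_unix_socket(path);
    if (fd < 0)
        return errno;
    close(fd);                  /* "crash": the socket file is not removed */

    fd = bind_unix_socket(path);    /* nobody is listening, yet this fails */
    if (fd < 0)
        err = errno;
    else
        close(fd);

    unlink(path);               /* clean up the leftover file */
    return err;
}
```

On the systems I would expect this to run on, the second bind() fails with EADDRINUSE even though no process holds the socket, which is exactly the situation a restarted postmaster faces.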

Bruce and I were just talking by phone about this, and we realized that
there is a completely different approach to making that decision: if you
want to know whether there's an old postmaster connected to a socket
file, try to connect to the old postmaster! In other words, pretend to
be a client and see if your connection attempt is answered. (You don't
have to try to log in, just see if you get a connection.) This might
also answer Peter's concern about socket files that belong to
non-Postgres programs, although I doubt that's really a big issue.
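
A rough sketch of the connect-as-a-client test, in C. This is only an illustration of the idea, not a proposed patch: socket_file_is_live is a hypothetical name, and the choice to treat ECONNREFUSED and ENOENT as "nobody home" is my assumption about how the result should be read. Any other failure is reported as inconclusive so the caller can refuse to remove the file.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical probe: try to connect to the Unix socket at "path".
 * Returns  1 if the connection attempt was answered (someone is listening),
 *          0 if the file is stale or missing (ECONNREFUSED or ENOENT),
 *         -1 if the result is inconclusive (errno is set). */
static int
socket_file_is_live(const char *path)
{
    struct sockaddr_un addr;
    int         fd;
    int         rc;
    int         save_errno;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* No login attempt: we only care whether the connection is answered. */
    rc = connect(fd, (struct sockaddr *) &addr, sizeof(addr));
    save_errno = errno;
    close(fd);

    if (rc == 0)
        return 1;               /* a live listener owns this socket file */
    if (save_errno == ECONNREFUSED || save_errno == ENOENT)
        return 0;               /* no listener: removing the file looks safe */

    errno = save_errno;
    return -1;                  /* anything else: do not remove the file */
}
```

A startup path might unlink() the old file only on a 0 result and abort on 1 or -1. Note that this sketch uses a blocking connect() with no timeout, so a listener whose backlog is full could stall or confuse the probe; real code would need to handle that case deliberately.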

There are some potential pitfalls here, like what if the old postmaster
is there but overloaded? But on the whole it seems like it might be
a cleaner answer than fooling around with lockfiles, and certainly safer
than relying on fcntl(SETLK) to work on a socket file. Comments anyone?

regards, tom lane
