Re: Latch implementation that wakes on postmaster death on both win32 and Unix

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Florian Pflug <fgp(at)phlo(dot)org>
Subject: Re: Latch implementation that wakes on postmaster death on both win32 and Unix
Date: 2011-07-04 15:53:43
Message-ID: 4E11E207.5080109@enterprisedb.com
Lists: pgsql-hackers

Ok, here's a new patch, addressing the issues Fujii raised, and with a
bunch of stylistic changes of my own. Also, I committed a patch to
remove silent_mode, so the fork_process() changes are now gone. I'm
going to sleep on this and review it once again tomorrow, and commit if
it still looks good to me and no-one else reports new issues.

There are two small issues left:

I don't like the names POSTMASTER_FD_WATCH and POSTMASTER_FD_OWN. At a
quick glance, it's not at all clear which is which. I couldn't come up
with better names, so for now I just added some comments to clarify
that. I would find WRITE/READ clearer, but to make sense of that you
need to know how the pipe is used. Any suggestions or opinions on that?
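
To make the pipe usage concrete, here is a minimal standalone sketch, not
code from the patch. The DEMO_FD_* names and all demo_* functions are
hypothetical, and it assumes WATCH is the read end that children poll and
OWN is the write end that only the parent keeps open:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical stand-ins for POSTMASTER_FD_WATCH / POSTMASTER_FD_OWN.
 * Assumption: WATCH is the read end, OWN is the write end.
 */
enum
{
    DEMO_FD_WATCH = 0,      /* read end: children watch this for EOF */
    DEMO_FD_OWN = 1         /* write end: only the parent keeps it open */
};

static int demo_alive_fds[2] = {-1, -1};

/* Parent, at startup: create the pipe. Returns 0 on success, -1 on error. */
int
demo_init_death_pipe(void)
{
    int flags;

    if (pipe(demo_alive_fds) != 0)
        return -1;

    /* Make the read end non-blocking so a liveness probe never hangs. */
    flags = fcntl(demo_alive_fds[DEMO_FD_WATCH], F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(demo_alive_fds[DEMO_FD_WATCH], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}

/*
 * Child, right after fork(): drop the inherited copy of the write end.
 * Otherwise the child itself would keep the pipe open after the parent dies.
 */
void
demo_close_own_end(void)
{
    if (demo_alive_fds[DEMO_FD_OWN] >= 0)
    {
        close(demo_alive_fds[DEMO_FD_OWN]);
        demo_alive_fds[DEMO_FD_OWN] = -1;
    }
}

/*
 * Liveness probe on the read end. Nothing is ever written to the pipe, so
 * while a write end is still open the read fails with EAGAIN/EWOULDBLOCK;
 * once every write end is closed, read() returns 0 (EOF).
 */
bool
demo_parent_is_alive(void)
{
    char    c;
    ssize_t rc;

    assert(demo_alive_fds[DEMO_FD_WATCH] >= 0);    /* pipe must be set up */
    rc = read(demo_alive_fds[DEMO_FD_WATCH], &c, 1);
    return rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

In a real parent/child setup the parent would call demo_init_death_pipe()
before forking and each child would call demo_close_own_end(); when the
parent exits, the kernel closes the last write end and the probe starts
returning false.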

The BUGS section of Linux man page for select(2) says:

> Under Linux, select() may report a socket file descriptor as "ready for
> reading", while nevertheless a subsequent read blocks. This could for
> example happen when data has arrived but upon examination has wrong
> checksum and is discarded. There may be other circumstances in which a
> file descriptor is spuriously reported as ready. Thus it may be safer
> to use O_NONBLOCK on sockets that should not block.

So in theory, on Linux WaitLatch might sometimes incorrectly
return WL_POSTMASTER_DEATH. None of the callers check for the
WL_POSTMASTER_DEATH return code; they call PostmasterIsAlive() before
assuming the postmaster has died, so that won't affect correctness at
the moment. I doubt that scenario can even happen in our case: select()
on a pipe that is never written to. But maybe we should add an
assertion to WaitLatch to assert that if select() reports that the
postmaster pipe has been closed, PostmasterIsAlive() also returns false.
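
The kind of cross-check meant here could look roughly like the following
standalone sketch. It is not the patch's WaitLatch; every demo_* name is
hypothetical, and demo_peer_is_alive() is only a stand-in for
PostmasterIsAlive() that assumes a non-blocking read on the pipe is an
acceptable liveness test:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int
demo_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Stand-in for PostmasterIsAlive(): EAGAIN means a writer still exists. */
bool
demo_peer_is_alive(int watch_fd)
{
    char    c;
    ssize_t rc = read(watch_fd, &c, 1);

    return rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}

/* True if select() reports the watched pipe end readable before timeout. */
bool
demo_select_reports_death(int watch_fd, long timeout_ms)
{
    int rc;

    do
    {
        fd_set          input_mask;
        struct timeval  tv;

        /* Rebuild both on every try; select() may have modified them. */
        FD_ZERO(&input_mask);
        FD_SET(watch_fd, &input_mask);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(watch_fd + 1, &input_mask, NULL, NULL, &tv);
        if (rc > 0)
            return FD_ISSET(watch_fd, &input_mask) != 0;
    } while (rc < 0 && errno == EINTR);

    return false;               /* timeout or hard error */
}

/*
 * Wait for "death", then assert that the readiness report was not
 * spurious. The assertion compiles away when NDEBUG is defined.
 */
bool
demo_wait_for_death(int watch_fd, long timeout_ms)
{
    bool death = demo_select_reports_death(watch_fd, timeout_ms);

    assert(!death || !demo_peer_is_alive(watch_fd));
    return death;
}
```

With an assertion like this, a spurious readiness report would trip in an
assert-enabled build, while production builds keep relying on the callers'
own liveness check.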

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment: new_latch-v7.2.patch (text/x-diff, 28.1 KB)
