Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Date: 2010-09-03 10:50:54
Message-ID: 4C80D30E.2010504@enterprisedb.com
Lists: pgsql-hackers

On 02/09/10 23:13, Tom Lane wrote:
> The WaitLatch ...timeout API could use a bit of refinement. I'd suggest
> defining negative timeout as meaning wait forever, so that timeout = 0
> can be used for "check but don't wait". Also, it seems like the
> function shouldn't just return void but should return a bool to show
> whether it saw the latch set or timed out.

In case of WaitLatchOrSocket, the caller might want to know if a latch
was set, the socket became readable, or it timed out. So we need three
different return values.
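
A minimal sketch of what such a three-way result could look like, not
the code from the patch. The flag names, the function name
wait_latch_or_socket, and the use of a plain file descriptor as a
stand-in for the latch's wakeup channel are all assumptions made for
illustration. It follows the timeout convention suggested above
(negative waits forever, zero only checks) and retries on EINTR without
looking at the fd set.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result flags; the names are illustrative only. */
#define WAIT_TIMED_OUT     0
#define WAIT_LATCH_SET     (1 << 0)
#define WAIT_SOCKET_READY  (1 << 1)

/*
 * Wait until latch_fd (a stand-in for the latch's wakeup channel) or
 * sock becomes readable, or until timeout_us microseconds elapse.
 * timeout_us < 0 waits forever; timeout_us == 0 checks without waiting.
 * Pass sock < 0 to wait on the latch alone.
 * Returns a bitmask of WAIT_* flags, or -1 on a select() error.
 */
int
wait_latch_or_socket(int latch_fd, int sock, long timeout_us)
{
    for (;;)
    {
        fd_set          rfds;
        struct timeval  tv;
        struct timeval *tvp = NULL;
        int             maxfd = (latch_fd > sock) ? latch_fd : sock;
        int             result = 0;
        int             rc;

        FD_ZERO(&rfds);
        FD_SET(latch_fd, &rfds);
        if (sock >= 0)
            FD_SET(sock, &rfds);

        if (timeout_us >= 0)
        {
            tv.tv_sec = timeout_us / 1000000L;
            tv.tv_usec = timeout_us % 1000000L;
            tvp = &tv;
        }

        rc = select(maxfd + 1, &rfds, NULL, NULL, tvp);
        if (rc < 0)
        {
            if (errno == EINTR)
                continue;       /* fd set is not valid here; just retry */
            return -1;
        }
        if (rc == 0)
            return WAIT_TIMED_OUT;

        if (FD_ISSET(latch_fd, &rfds))
            result |= WAIT_LATCH_SET;
        if (sock >= 0 && FD_ISSET(sock, &rfds))
            result |= WAIT_SOCKET_READY;
        return result;
    }
}
```

A bitmask lets the caller learn that both events happened in one call;
an enum with three values would work too if only one reason is ever
reported. Note that this simple retry passes the full timeout to
select() again after EINTR.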

> (Yeah, I realize the caller
> could look into the latch to find that out, but callers really ought to
> treat latches as opaque structs.)

Hmm, maybe we need a TestLatch function to check if a latch is set.

> I don't think you have the select-failed logic right in
> WaitLatchOrSocket; on EINTR it will suppose that FD_ISSET is a valid
> test to make, which I think ain't the case. Just "continue" around
> the loop.

Yep.

I also realized that the timeout handling is a bit surprising with
interrupts. After EINTR we call select() again with the same timeout, so
a signal effectively restarts the timer. We seem to have similar
behavior in a couple of other places, in pgstat.c and auth.c. So maybe
that's OK and just needs to be documented, but I thought I'd bring it up.
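
If restarting the timer turned out not to be acceptable, one
alternative is to fix a deadline up front and recompute the remaining
time before every select() call. The following is only a sketch under
that assumption; the function names are hypothetical, it handles a
single descriptor, and it expects a non-negative timeout.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Microseconds left until the deadline, clamped at zero. */
static long long
remaining_us(const struct timespec *deadline)
{
    struct timespec now;
    long long       left;

    clock_gettime(CLOCK_MONOTONIC, &now);
    left = (long long) (deadline->tv_sec - now.tv_sec) * 1000000LL
        + (deadline->tv_nsec - now.tv_nsec) / 1000;
    return (left > 0) ? left : 0;
}

/*
 * Wait for fd to become readable for at most timeout_us microseconds
 * in total, no matter how many signals interrupt the wait.
 * timeout_us must be >= 0.
 * Returns 1 if readable, 0 on timeout, -1 on error other than EINTR.
 */
int
wait_readable_with_deadline(int fd, long timeout_us)
{
    struct timespec deadline;

    /* Compute the absolute deadline once, before the first wait. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += timeout_us / 1000000L;
    deadline.tv_nsec += (timeout_us % 1000000L) * 1000L;
    if (deadline.tv_nsec >= 1000000000L)
    {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;)
    {
        fd_set          rfds;
        struct timeval  tv;
        long long       left = remaining_us(&deadline);
        int             rc;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = (time_t) (left / 1000000);
        tv.tv_usec = (suseconds_t) (left % 1000000);

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;           /* next pass uses only the time left */
        return rc;
    }
}
```

The cost is an extra clock_gettime() call per iteration, which may or
may not matter for the callers in question.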

> It seems like both implementations are #include'ing more than they
> ought to --- why replication/walsender.h, in particular?

The Windows implementation needs it for the max_wal_senders variable, to
allocate enough shared Event objects in LatchShmemInit. In unix_latch.c
it's not needed.

> Also, using sig_atomic_t for owner_pid is entirely not sane.
> On many platforms sig_atomic_t is only a byte, and besides
> which you have no need for that field to be settable by a
> signal handler.

Hmm, true, it doesn't need to be set from a signal handler, but is there
an atomicity problem if one process calls ReleaseLatch while another
process is in SetLatch? ReleaseLatch sets owner_pid to 0, while SetLatch
reads it and calls kill() on it. Can we assume that pid_t is atomic, or
do we need a spinlock to protect it? (The Windows implementation has a
similar issue with HANDLE instead of pid_t.)
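
To make the race concrete, here is a sketch of the reading side. The
struct layout, the names DemoLatch and demo_set_latch, and the choice of
SIGUSR1 are assumptions for illustration, not the patch's code. It reads
owner_pid once into a local variable and tests that copy, because
kill() with a pid of 0 signals the caller's whole process group.
Whether a single read of pid_t can be torn is exactly the open question
above, and this sketch does not answer it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical latch layout, for illustration only. */
typedef struct
{
    volatile sig_atomic_t is_set;
    volatile pid_t        owner_pid;   /* 0 means "no owner" */
} DemoLatch;

/*
 * Set the latch and wake its owner, if any.
 * Returns 1 if a signal was sent, 0 if nothing had to be sent,
 * and -1 if kill() failed.
 */
int
demo_set_latch(DemoLatch *latch)
{
    pid_t owner;

    if (latch->is_set)
        return 0;               /* already set, owner was woken before */
    latch->is_set = 1;

    /*
     * Read owner_pid exactly once. Testing latch->owner_pid and then
     * passing latch->owner_pid to kill() would be two reads, and a
     * concurrent release could zero the field in between.
     */
    owner = latch->owner_pid;
    if (owner == 0)
        return 0;               /* nobody owns the latch right now */

    return (kill(owner, SIGUSR1) == 0) ? 1 : -1;
}
```

Even with the single read, the owner could release the latch and exit
right after the copy is taken, so the signal may go to a process that no
longer cares about it; a spinlock around the field would close that
window at the cost of taking a lock in SetLatch.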

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
