Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
Date: 2010-08-23 21:30:04
Message-ID: 4C72E85C.3000201@enterprisedb.com
Lists: pgsql-hackers

On 20/08/10 17:28, Tom Lane wrote:
> [ It's way past time to change the thread title ]
>
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> On 20/08/10 16:24, Tom Lane wrote:
>>> You keep on proposing solutions that only work for walsender :-(.
>
>> Well yes, the other places where we use pg_usleep() are not really a
>> problem as is.
>
> Well, yes they are. They cause unnecessary process wakeups and thereby
> consume cycles even when the database is idle. See for example a
> longstanding complaint here:
> https://bugzilla.redhat.com/show_bug.cgi?id=252129
>
> If we're going to go to the trouble of having a mechanism like this,
> I'd like it to fix that problem so I can close out that bug.

Hmm, if you want to put bgwriter and walwriter to deep sleep, then
someone will need to wake them up when they have work to do. Currently
they poll. Maybe they should just sleep longer, like 10 seconds, if
there hasn't been any work to do in the last X wakeups.
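
As a rough sketch of that "sleep longer when idle" idea (all names and
constants here are made up for illustration, and IDLE_THRESHOLD merely
stands in for the unspecified X), the sleep time could be picked like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning constants, not taken from the PostgreSQL sources. */
#define NORMAL_SLEEP_MS 200     /* assumed regular polling interval */
#define IDLE_SLEEP_MS   10000   /* the "10 seconds" mentioned above */
#define IDLE_THRESHOLD  50      /* stands in for "the last X wakeups" */

typedef struct
{
    int idle_wakeups;           /* consecutive wakeups that found no work */
} PollState;

/*
 * Decide how long to sleep before the next poll, in milliseconds.
 * did_work tells whether the wakeup that just finished found any work.
 */
static long
next_sleep_ms(PollState *state, bool did_work)
{
    assert(state != NULL);

    if (did_work)
        state->idle_wakeups = 0;
    else if (state->idle_wakeups < IDLE_THRESHOLD)
        state->idle_wakeups++;

    /* Back off only after IDLE_THRESHOLD idle wakeups in a row. */
    if (state->idle_wakeups >= IDLE_THRESHOLD)
        return IDLE_SLEEP_MS;
    return NORMAL_SLEEP_MS;
}
```

This only reduces the number of idle wakeups; it does not remove them, and
the process reacts to new work up to IDLE_SLEEP_MS late while backed off.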

We've been designing the new sleep facility so that the event that wakes
up the sleep is sent from the signal handler in the same process, but it
seems that all the potential users would actually want to be woken up
from *another* process, so the signal handler seems like an unnecessary
middleman. Particularly on Windows where signals are simulated with
pipes and threads, while you could just send a Windows event directly
from one process to another.

A common feature that all the users of this facility want is that once
the event is sent, re-sending it is a fast no-op until re-enabled by the
receiver. For example, if backends need to wake up bgwriter after
dirtying a buffer, they shouldn't waste many cycles determining that
bgwriter is already active and doesn't need to be woken up.
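
A minimal sketch of such a "fast no-op" re-send, assuming a flag that both
sides can reach and C11 atomics (WakeupFlag, request_wakeup and rearm are
hypothetical names, not an existing API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    atomic_bool pending;        /* true once a wakeup has been requested */
} WakeupFlag;

/*
 * Request a wakeup. Returns true only for the caller that flipped the
 * flag from false to true; that caller must deliver the real (expensive)
 * wakeup. Everyone else returns false after a single cheap read.
 */
static bool
request_wakeup(WakeupFlag *flag)
{
    assert(flag != NULL);

    /* Fast path: a wakeup is already pending, nothing to do. */
    if (atomic_load(&flag->pending))
        return false;

    /* Slow path: atomic exchange makes sure exactly one caller wins. */
    return !atomic_exchange(&flag->pending, true);
}

/* Called by the receiver when it is ready to be woken up again. */
static void
rearm(WakeupFlag *flag)
{
    assert(flag != NULL);
    atomic_store(&flag->pending, false);
}
```

The plain read in the fast path keeps the common case from writing to the
shared cache line at all.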

Let's call these "latches". I'm thinking of something like this:

/* Similar to LWLockId */
typedef enum
{
    BgwriterLatch,
    WalwriterLatch,
    /* plus one for each walsender */
} LatchId;

/*
 * Wait for given latch to be set. Only one process can wait
 * for a given latch at a time.
 */
WaitLatch(LatchId latch, long timeout);

/*
 * Sets latch. Returns quickly if the latch is set already.
 */
SetLatch(LatchId latch);

/*
 * Clear the latch. Calling WaitLatch after this will sleep, unless
 * the latch is set again before the WaitLatch call.
 */
ResetLatch(LatchId latch);

There would be a boolean for each latch in shared memory, to indicate if
the latch is "armed", allowing quick return from SetLatch if the latch
is already set. Plus a signal to wake up the waiting process (maybe use
procsignal.c), and the self-pipe trick within the receiving process to
make it free of race conditions. On Windows, the signal and the self-pipe
trick are replaced with Windows events.
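
For the Unix side, the receiving half could look roughly like the sketch
below. It is a single-process illustration of the self-pipe trick only:
the function names are hypothetical, SIGUSR1 is an arbitrary choice, and
the shared-memory boolean and the procsignal.c integration are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

static int selfpipe_fds[2] = {-1, -1};  /* [0] = read end, [1] = write end */
static volatile sig_atomic_t latch_is_set = 0;

/* Signal handler: mark the latch set and make the pipe readable. */
static void
wakeup_handler(int signo)
{
    int save_errno = errno;

    (void) signo;
    latch_is_set = 1;
    if (write(selfpipe_fds[1], "x", 1) < 0)
    {
        /* Pipe full: a wakeup byte is already pending, so ignore. */
    }
    errno = save_errno;
}

/* Create the non-blocking pipe and install the handler. 0 on success. */
static int
init_latch(void)
{
    struct sigaction sa;

    if (pipe(selfpipe_fds) < 0)
        return -1;
    for (int i = 0; i < 2; i++)
    {
        int flags = fcntl(selfpipe_fds[i], F_GETFL);

        if (flags < 0 || fcntl(selfpipe_fds[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
    }
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Clear the latch: reset the flag first, then drain the pipe. */
static void
reset_latch(void)
{
    char buf[16];

    latch_is_set = 0;
    while (read(selfpipe_fds[0], buf, sizeof(buf)) > 0)
        continue;
}

/*
 * Wait until the latch is set or timeout_ms elapses; returns true if set.
 * A signal arriving between the flag check and poll() is not lost,
 * because the handler's byte keeps the pipe readable. An unrelated
 * signal (EINTR) simply restarts the full timeout in this sketch.
 */
static bool
wait_latch(int timeout_ms)
{
    struct pollfd pfd = {.fd = selfpipe_fds[0], .events = POLLIN};

    assert(selfpipe_fds[0] >= 0);   /* init_latch() must have run */
    if (latch_is_set)
        return true;
    while (poll(&pfd, 1, timeout_ms) < 0 && errno == EINTR)
        continue;
    return latch_is_set != 0;
}
```

In the design above, the signal would come from SetLatch in another
process rather than from the receiver itself; a waiter would typically
reset the latch, check for work, and only then wait.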

I'll try out this approach tomorrow.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
