Re: FlexLocks

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: simon(at)2ndquadrant(dot)com, alvherre(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: FlexLocks
Date: 2011-11-30 20:26:54
Message-ID: CA+TgmoaE5MSkt-Ey4oKwLTVYHw5-pV1muAimrF_x4=foh6e=mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 7:18 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Why is it OK to drop these lines from the else condition in
> ProcArrayEndTransaction()?:
>
>        /* must be cleared with xid/xmin: */
>        proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;

It's probably not. Oops.

I believe the attached patch versions address your comments regarding
the flexlock patch as well; it is also rebased over the PGXACT patch,
which has since been committed.

> The extraWaits code still looks like black magic to me, so unless
> someone can point me in the right direction to really understand
> that, I can't address whether it's OK.

I don't think I've changed the behavior, so it should be fine. The
idea is that something like this can happen:

1. Backend #1 does some action which will eventually cause some other
process to send it a wakeup (like adding itself to the wait-queue for
a heavyweight lock).
2. Before actually going to sleep, backend #1 tries to acquire an
LWLock. The LWLock is not immediately available, so it sleeps on its
process semaphore.
3. Backend #2 sees the shared memory state created in step one and
decides sends a wakeup to backend #1 (for example, because the lock is
now available).
4. Backend #1 receives the wakeup. It duly reacquires the spinlock
protecting the LWLock, sees that the LWLock is not available, releases
the spinlock, and goes back to sleep.
5. Backend #3 now releases the LWLock that backend #1 is trying to
acquire and, as backend #1 is first in line, it sends backend #1 a
wakeup.
6. Backend #1 now wakes up again, reacquires the spinlock, gets the
lwlock, releases the spinlock, does some stuff, and releases the
lwlock.
7. Backend #1, having now finished what it needed to do while holding
the lwlock, is ready to go to sleep and wait for the event that it
queued up for back in step #1. However, the wakeup for that event
*has already arrived* and was consumed by the LWLock machinery. So
when backend #1 goes to sleep, it's waiting for a wakeup that will
never arrive, because it already did arrive, and got eaten.

The solution is the "extraWaits" thing; in step #6, we remember that
we received an extra, useless wakeup in step #4 that we threw away.
To make up for having thrown away a wakeup someone else sent us in
step #3, we send ourselves a wakeup in step #6. That way, when we go
to sleep in step #7, we wake up immediately, just as we should.

> The need to modify flexlock_internals.h and flexlock.c seems to me to
> indicate a lack of desirable modularity here.  The lower level object
> type shouldn't need to know about each and every implementation of a
> higher level type which uses it, particularly not compiled in like
> that.  It would be really nice if each of the higher level types
> "registered" with flexlock at runtime, so that the areas modified at
> the flexlock level in this patch file could be generic.  Among other
> things, this could allow extensions to use specialized types, which
> seems possibly useful.  Does that (or some other technique to address
> the concern) seem feasible?

Possibly; let me think about that. I haven't addressed that in this version.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
flexlock-v3.patch application/octet-stream 70.2 KB
procarraylock-v2.patch application/octet-stream 33.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-11-30 20:29:02 Re: WIP: index support for regexp search
Previous Message Jesper Krogh 2011-11-30 20:15:40 Accounting for toast in query planner. (gin/gist indexes).