Re: WALInsertLock contention

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WALInsertLock contention
Date: 2011-06-08 14:18:53
Message-ID: BANLkTi=hWpaB+uGwEdUvsM2OdyQaL4OnOg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 8, 2011 at 7:44 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 8, 2011 at 1:59 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> There's probably an obvious explanation that I'm not seeing, ...
>
> Yep.  :-)
>
>> but if
>> you're not delegating the work of writing the buffers out to someone
>> else, why do you need to lock the per backend buffer at all?  That is,
>> why does it have to be in shared memory?  Suppose that if the
>> following are true:
>> *) Writing qualifying data (non commit, non switch)
>> *) There is room left in whatever you are copying to
>> you could trylock WalInsertLock, and if failing to get it, just copy
>> qualifying data into a private buffer and punt if the following are
>> true...otherwise just do the current behavior.
>
> And here it is: Writing a buffer requires a write & fsync of WAL
> through the buffer LSN.  If the WAL for the buffers were completely
> inaccessible to other backends, then those buffers would be pinned in
> shared memory.  Which would make things very difficult at buffer
> eviction time, or for checkpoints.

Well, (bear with me here) I'm not giving up that easy. Pinning a
judiciously small number of buffers in shared memory so you can reduce
congestion on the insert lock might be an acceptable trade-off in high
contention scenarios...in fact I assumed that was the whole point of
your original idea, which I still think has tremendous potential.
Obviously, you wouldn't want more than a very small percentage of
shared buffers overall (say 1-10% max) to be pinned this way. The
trylock is an attempt to cap the downside so that you aren't
unnecessarily pinning buffers in, say, long running i/o bound
transactions where insert lock contention is low. Maybe you could
experiment with very small private insert buffer sizes (say 64 kB)
that would hopefully provide some of the benefits (if there are in
fact any) and mitigate the potential costs. Another tweak you could
make is that, once a trylock has failed within a transaction, you
always punt from there on until you fill up or need to block per
ordering requirements. Or maybe the whole thing doesn't help at
all...just trying to understand the problem better.
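
To make that concrete, the insert path I'm picturing is roughly the
following (pseudo-code only: PrivateWalBuffer and do_normal_xlog_insert
are made-up names, the LWLock calls are the real ones, and everything
else is hand-waving):

#include "postgres.h"
#include "storage/lwlock.h"

#define PRIVATE_WAL_BUF_SIZE (64 * 1024)    /* the 64 kB guess from above */

typedef struct PrivateWalBuffer
{
    char   data[PRIVATE_WAL_BUF_SIZE];
    Size   used;
    bool   punting;     /* a trylock already failed this transaction */
} PrivateWalBuffer;

/*
 * Returns true if the record was handled (inserted normally or copied
 * into the private buffer); false means the caller must take the
 * ordinary, blocking XLogInsert path instead.
 */
static bool
private_wal_insert(PrivateWalBuffer *priv, char *rec, Size len,
                   bool is_commit_or_switch)
{
    /*
     * Commit/switch records and overflow take the normal path (which
     * would first have to flush anything already buffered privately,
     * to preserve ordering).
     */
    if (is_commit_or_switch || priv->used + len > PRIVATE_WAL_BUF_SIZE)
        return false;

    /* Until a trylock fails, behave exactly as today. */
    if (!priv->punting &&
        LWLockConditionalAcquire(WALInsertLock, LW_EXCLUSIVE))
    {
        do_normal_xlog_insert(rec, len);    /* made-up stand-in */
        LWLockRelease(WALInsertLock);
        return true;
    }

    /* Contention: remember it, and buffer privately from here on. */
    priv->punting = true;
    memcpy(priv->data + priv->used, rec, len);
    priv->used += len;
    return true;
}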

> At any rate, even if it were possible to make it work, it'd be a
> misplaced optimization.  It isn't touching shared memory - or even
> touching the LWLock - that's expensive; it's the LWLock contention
> that kills you, either because stuff blocks, or just because the CPUs
> burn a lot of cycles fighting over cache lines.  An LWLock that is
> typically taken by only one backend at a time is pretty cheap.  I
> suppose I couldn't afford to be so blasé if we were trying to scale to
> 2048-core systems where even inserting a memory barrier is expensive
> enough to worry about, but we've got a ways to go before we need to
> start worrying about that.

Right -- it isn't so much an optimization (though you still want to
keep the work done under the lock as light as reasonably possible, and
a shm->shm copy is going to be slower than mem->shm) as a
simplification trade-off. You don't have to worry about deadlocks
while manipulating your per-backend buffers during your internal
'flush', and it's generally just easier to work with private memory
(less code, less locking, less everything).
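
The 'flush' itself could then be as dumb as this (same hypothetical
PrivateWalBuffer as in the sketch above; copy_into_shared_wal is a
made-up stand-in for whatever actually moves the bytes into the shared
WAL buffers):

/*
 * Drain the backend-private buffer under the real lock, blocking if
 * necessary; called when the buffer fills up, or when ordering rules
 * (commit, xlog switch, buffer eviction) force the records out.
 */
static void
private_wal_flush(PrivateWalBuffer *priv)
{
    if (priv->used == 0)
        return;

    LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);

    /* One bulk copy instead of many small contended inserts. */
    copy_into_shared_wal(priv->data, priv->used);   /* made-up stand-in */

    LWLockRelease(WALInsertLock);

    priv->used = 0;
    priv->punting = false;      /* start trying the trylock again */
}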

One point I'm missing, though. Getting back to your original idea, how
does writing to shmem prevent you from having to keep buffers pinned?
I'm reading your comment here:
"Those buffers are stamped with a fake LSN that
points back to the per-backend WAL buffer, and they can't be written
until the WAL has been moved from the per-backend WAL buffers to the
main WAL buffers."

That suggests to me that you have to keep them pinned anyway. I'm
still a bit fuzzy on how having the per-backend buffers in shm conveys
any advantage. IOW (trying not to be obtuse), under what circumstances
would backend A want to read from or (especially) write to backend B's
WAL buffer?

merlin
