Re: Wait free LW_SHARED acquisition - v0.9

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Wait free LW_SHARED acquisition - v0.9
Date: 2014-10-14 13:40:49
Message-ID: CAHyXU0xJxcOo-Erw8z-gAPDUbxf_1giO9UQAnvB4fkk4FzSZgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 10, 2014 at 11:00 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-10-10 16:41:39 +0200, Andres Freund wrote:
>> FWIW, the profile always looks like
>> - 48.61% postgres postgres [.] s_lock
>> - s_lock
>> + 96.67% StrategyGetBuffer
>> + 1.19% UnpinBuffer
>> + 0.90% PinBuffer
>> + 0.70% hash_search_with_hash_value
>> + 3.11% postgres postgres [.] GetSnapshotData
>> + 2.47% postgres postgres [.] StrategyGetBuffer
>> + 1.93% postgres [kernel.kallsyms] [k] copy_user_generic_string
>> + 1.28% postgres postgres [.] hash_search_with_hash_value
>> - 1.27% postgres postgres [.] LWLockAttemptLock
>> - LWLockAttemptLock
>> - 97.78% LWLockAcquire
>> + 38.76% ReadBuffer_common
>> + 28.62% _bt_getbuf
>> + 8.59% _bt_relandgetbuf
>> + 6.25% GetSnapshotData
>> + 5.93% VirtualXactLockTableInsert
>> + 3.95% VirtualXactLockTableCleanup
>> + 2.35% index_fetch_heap
>> + 1.66% StartBufferIO
>> + 1.56% LockReleaseAll
>> + 1.55% _bt_next
>> + 0.78% LockAcquireExtended
>> + 1.47% _bt_next
>> + 0.75% _bt_relandgetbuf
>>
>> to me. Now that's with the client count 496, but it's similar with lower
>> counts.
>>
>> BTW, that profile *clearly* indicates we should make StrategyGetBuffer()
>> smarter.
>
> Which is nearly trivial now that atomics are in. Check out the attached
> WIP patch which eliminates the spinlock from StrategyGetBuffer() unless
> there's buffers on the freelist.

Is this safe?

+ victim = pg_atomic_fetch_add_u32(&StrategyControl->nextVictimBuffer, 1);

- if (++StrategyControl->nextVictimBuffer >= NBuffers)
+ buf = &BufferDescriptors[victim % NBuffers];
+
+ if (victim % NBuffers == 0)
<snip>

I don't think there's any guarantee you could keep nextVictimBuffer
from wandering off the end. You could buy a little safety by CAS'ing
zero in instead, followed by atomic increment. However that's still
pretty dodgy IMO and requires two atomic ops which will underperform
the spin in some situations.

I like Robert's idea to keep the spinlock but preallocate a small
amount of buffers, say 8 or 16.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-10-14 13:58:43 Re: Wait free LW_SHARED acquisition - v0.9
Previous Message Sawada Masahiko 2014-10-14 13:32:10 Drop any statistics of table after it's truncated