Re: Proposal of tunable fix for scalability of 8.4

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-19 20:57:03
Message-ID: C5E7FFAF.3874%scott@richrelevance.com
Lists: pgsql-performance

On 3/18/09 2:25 PM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Mar 18, 2009 at 1:43 PM, Scott Carey <scott(at)richrelevance(dot)com> wrote:
>>>> It's worth ruling out given that even if the likelihood is small, the fix is
>>>> easy.  However, I don't see the throughput drop from peak as more
>>>> concurrency is added that is the hallmark of this problem -- usually with a
>>>> lot of context switching and a sudden increase in CPU use per transaction.
>>>
>>> The problem is that the proposed "fix" bears a strong resemblance to
>>> attempting to improve your gas mileage by removing a few non-critical
>>> parts from your car, like, say, the bumpers, muffler, turn signals,
>>> windshield wipers, and emergency brake.
>>
>> The fix I was referring to as easy was using a connection pooler -- as a
>> reply to the previous post. Even if it's a low likelihood that the connection
>> pooler fixes this case, it's worth looking at.
>
> Oh, OK. There seem to be some smart people saying that's a pretty
> high-likelihood fix. I thought you were talking about the proposed
> locking change.
>

Sorry for the confusion. I was countering the contention that a connection
pool would fix all of this, and I gave it a low likelihood of removing the
lock contention, given the results of the first set of data and its linear
ramp-up.

I frankly think it is extremely unlikely given the test results that
figuring out how to run this with 64 threads (instead of the current linear
ramp up to 128) will give 100% CPU utilization.
Any system that gets 100% CPU utilization with CPU_COUNT concurrent
processes or threads and only 35% with CPU_COUNT*2 would be seriously flawed
anyway... The only plausible causes I can think of would be if
each one used enough memory to cause swapping, or something else that forces
disk i/o.

Granted, Postgres isn't perfect and there is overhead for idle, tiny
connections. But handling CPU_COUNT*2 connections with half idle and half
active, as the current test case does, does not invalidate the test -- it
makes it realistic.
A 64-thread test case that spends zero time in the client would still be
useful to provide more information, however.

>>> While it's true that the car
>>> might be drivable in that condition (as long as nothing unexpected
>>> happens), you're going to have a hard time convincing the manufacturer
>>> to offer that as an options package.
>>
>> The original poster's request is for a config parameter, for experimentation
>> and testing by the brave. My own request was for that version of the lock to
>> prevent possible starvation but improve performance by unlocking all shared
>> at once, then doing all exclusives one at a time next, etc.
>
> That doesn't prevent starvation in general, although it will for some
> workloads.

I'm pretty sure it would; it would guarantee that you alternate between
shared and exclusive. Although if the implementation lets shared lockers cut
in line at the wrong time, it would not.
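
To make the alternation concrete, here is a minimal Python simulation of the
wake-up policy described above. It is only a sketch under stated assumptions,
not PostgreSQL's lock code: the function name `run`, the step-based arrival
model, and the "S"/"X" mode labels are all hypothetical. It assumes that
shared lockers arriving during an exclusive phase must wait for the next
shared phase (no cutting in line).

```python
from collections import deque


def run(arrivals):
    """Simulate alternating shared/exclusive grants.

    arrivals[t] lists the (name, mode) waiters that join the queue just
    before grant step t; mode is "S" (shared) or "X" (exclusive).
    Returns the grant batches in the order they are issued.
    """
    shared, exclusive = deque(), deque()
    grants = []
    pending_x = 0  # exclusives still owed a turn in the current exclusive phase
    t = 0
    while t < len(arrivals) or shared or exclusive:
        if t < len(arrivals):
            for name, mode in arrivals[t]:
                (shared if mode == "S" else exclusive).append(name)
        if pending_x == 0:
            # Shared phase: wake every waiting shared locker at once.
            if shared:
                grants.append(("S", list(shared)))
                shared.clear()
            # Only exclusives already waiting now get a turn in the next phase.
            pending_x = len(exclusive)
        else:
            # Exclusive phase: wake exactly one exclusive locker per step.
            grants.append(("X", [exclusive.popleft()]))
            pending_x -= 1
        t += 1
    return grants


if __name__ == "__main__":
    steps = [
        [("r1", "S"), ("w1", "X"), ("r2", "S")],
        [("r3", "S")],
        [("r4", "S"), ("w2", "X")],
        [("r5", "S")],
    ]
    for mode, names in run(steps):
        print(mode, names)
```

In this model a steady stream of shared arrivals cannot hold off a waiting
exclusive locker for more than one shared batch, and an exclusive phase only
serves the exclusives that were already queued when it began, so shared
lockers get the next turn.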

>
> Anyway, it seems rather pointless to add a config parameter that isn't
> at all safe, and adds overhead to a critical part of the system for
> people who don't use it. After all, if you find that it helps, what
> are you going to do? Turn it on in production? I just don't see how
> this is any good other than as a thought-experiment.

The safety is yet to be determined. The overhead is yet to be determined.
You are assuming the worst case for both.
If it turns out that the current implementation can cause starvation
already, which the parallel discussion here indicates, that makes your
starvation concern an issue for both.

>
> At any rate, as I understand it, even after Jignesh eliminated the
> waits, he wasn't able to push his CPU utilization above 48%. Surely
> something's not right there. And he also said that when he added a
> knob to control the behavior, he got a performance improvement even
> when the knob was set to 0, which corresponds to the behavior we have
> already anyway. So I strongly suspect that there's something wrong
> with either the system or the test. Until that's understood and
> fixed, I don't think that looking at the numbers is worth much.
>

The next bottleneck at 48% CPU is definitely very interesting. However, it
has an explanation: the test blocked on other locks.

The observation about the "old" algorithm with his patch going faster should
be understood to a point, but you don't need to understand everything in
order to show that it is safe or better. There are changes, though, that
may explain it. In Jignesh's words:

" still using default logic
(thought different way I compare sequential using fields from the
previous proc structure instead of comparing with constant boolean) "

It is possible that that minor change did some cache locality and/or branch
prediction trick on the processor he has. I've seen plenty of strange
effects caused by tiny changes before. It's expected to find the unexpected.
It will be useful to know what caused the improvement (was it the above?)
but we don't need to know why it changed -- that may be hard to get at
without looking at the assembly code output and being an expert on that
processor/compiler.

One of the trickiest things about locks is that the little details are VERY
hardware dependent, and the hardware can change the tradeoffs significantly
from generation to generation (e.g. Intel's next x86 chips have a faster
compare and swap operation, and a special instruction for "spinning" that
doesn't spin and allows the "spinner" to not compete for execution resources
with other hardware threads, so spin locks are more viable and all locks and
atomics are faster).

>> I alluded to the three main ways of dealing with lock contention elsewhere.
>> Avoiding locks, making finer grained locks, and making locks faster.
>> All are worthy.  Some are harder to do than others.  Some have been heavily
>> tuned already.  It's a case-by-case basis.  And regardless, the unfair lock
>> is a good test tool.
>
> In view of the caveats above, I'll give that a firm maybe.
>
> ...Robert
>

My main point here is that it clearly shows what the 'next' bottleneck is,
so at minimum it can be used to estimate what the impact of lock changes or
avoiding locks may be on various configurations and test scenarios.
