Re: Proposal of tunable fix for scalability of 8.4

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-12 17:39:05
Message-ID: C5DE96C9.337E%scott@richrelevance.com
Lists: pgsql-performance


On 3/12/09 8:13 AM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

>>> Scott Carey <scott(at)richrelevance(dot)com> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>
>> I'm a lot more interested in what's happening between 60 and 180
>> than over 1000, personally. If there was a RAID involved, I'd put
>> it down to better use of the numerous spindles, but when it's all
>> in RAM it makes no sense.
>
> If there is enough lock contention and a common lock case is a short
> lived shared lock, it makes perfect sense. Fewer readers are
> blocked waiting on writers at any given time. Readers can 'cut' in
> line ahead of writers within a certain scope (only up to the number
> waiting at the time a shared lock is at the head of the queue).
> Essentially this clumps up shared and exclusive locks into larger
> streaks, and allows for higher shared lock throughput.

You misunderstood me. I wasn't addressing the effects of his change,
but rather the fact that his test shows a linear improvement in TPS up
to 1000 connections for a 64 thread machine which is dealing entirely
with RAM -- no disk access. Where's the bottleneck that allows this
to happen? Without understanding that, his results are meaningless.

-Kevin

They are not meaningless. There is certainly more to understand, but the test is entirely valid without that. In a CPU-bound / RAM-bound case, as concurrency increases you look at the throughput trend, the %CPU use trend, and the context switch rate trend. More information would be useful, but the test is validated by the evidence that throughput is held up by lock contention.

The possible reasons for not scaling with user count at lower numbers are numerous: network, client limitations, or 'lock locality' (if the test users access data blocks in an organized pattern rather than a random distribution, neighboring clients are more likely to block each other than non-neighboring ones).
Furthermore, the MOST valid types of tests don't drive each user in an ASAP fashion, but with some pacing to emulate the real world. In this case you expect the user count to be significantly greater than the CPU core count before saturation. We need more info about the relationship between "users" and active postgres backends. If each user sleeps for 100 ms between queries (or processes results and writes HTML for 100 ms), your assumption that it should take about <CPU core count> users to saturate the CPUs is flawed.
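
To put rough numbers on the pacing argument, here is a minimal sketch. It assumes an idealized closed system with no lock or queueing delay; the function name and the 10 ms per-query CPU time are made up for illustration and are not taken from the test in question.

```python
def users_to_saturate(cores, service_ms, think_ms):
    """Rough number of paced clients needed to keep `cores` CPUs busy.

    Assumes each client alternates between `service_ms` of CPU work on a
    backend and `think_ms` of sleeping, with no lock or queueing delay.
    Both timings are hypothetical inputs, not measurements.
    """
    # Each client is on a CPU for only service / (service + think) of the time.
    busy_fraction = service_ms / (service_ms + think_ms)
    return cores / busy_fraction


# 64 hardware threads, an assumed 10 ms of CPU per query, 100 ms of pacing.
print(users_to_saturate(64, 10, 100))  # 704.0
# With no pacing at all the estimate falls back to the core count.
print(users_to_saturate(64, 10, 0))    # 64.0
```

Under those assumed timings, several hundred users before saturation is what you would expect, not an anomaly.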

Either way, the result here demonstrates something powerful with respect to CPU scalability, and just because 300 clients isn't where it peaks does not mean it's invalid; it merely means we don't have enough information to fully understand the test.

The fact is very simple: because of lock contention, increasing concurrency does not saturate all the CPUs. That can be shown from the results presented, without more information.
User count is irrelevant - throughput increases linearly with user count for quite a while and then peaks and slightly dips. This is the typical curve for all tests with a measured pacing per client.
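
The shape of that curve can be sketched with the standard asymptotic bound for a closed system with think time. This is only an illustration: the function name and the timings are hypothetical, and the model captures the linear rise and the plateau but not the slight dip, nor a ceiling lowered by lock contention.

```python
def throughput_bound(users, cores, service_ms, think_ms):
    """Upper bound on queries/second for `users` paced clients.

    Idealized model: below saturation every client completes one query
    per (service + think) interval; above it the CPUs are the limit.
    """
    paced_limit = users * 1000 / (service_ms + think_ms)  # client-limited
    cpu_limit = cores * 1000 / service_ms                 # CPU-limited
    return min(paced_limit, cpu_limit)


# Assumed timings: 10 ms CPU per query, 100 ms pacing, 64 hardware threads.
for n in (110, 220, 440, 880, 1760):
    print(n, throughput_bound(n, 64, 10, 100))
```

In this model throughput doubles with the user count until the CPU limit is reached and then stays flat. A measured plateau well below the CPU limit, with idle CPU remaining, is what points at lock contention.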
We want to know more, though. More data would help (active postgres backends, %CPU, and context switch rate would be my top 3 extra columns in the data set). From there, all that we want to know is what the locks are and whether that contention is artificial. What tools are available to show which locks are most contended in Postgres? Once the locks are known, we want to know if the locking can be tuned away by one of three general types of strategies: less locking via smart use of atomics or copy on write (non-blocking strategies, probably fully investigated already); finer grained locks (most definitely investigated); improved performance of locks (looked into for sure, but highly hardware dependent).
