Re: Any better plan for this query?..

From: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Any better plan for this query?..
Date: 2009-05-14 18:34:48
Message-ID: 5482c80a0905141134n61e0085ak4c5f33633facd7b7@mail.gmail.com
Lists: pgsql-performance

Hi Scott,

let me now finish my report and regroup all the data together, and then
we'll continue the discussion as it moves more into the debug/profile
phase.. - it would not be polite of me to send tons of attachments to
the mailing list :-)

Rgds,
-Dimitri

On 5/13/09, Scott Carey <scott(at)richrelevance(dot)com> wrote:
>
> On 5/13/09 3:22 AM, "Dimitri" <dimitrik(dot)fr(at)gmail(dot)com> wrote:
>
>> Hi Scott,
>>
>> On 5/12/09, Scott Carey <scott(at)richrelevance(dot)com> wrote:
>>> Although nobody wants to support it, he should try the patch that Jignesh
>>> K.
>>> Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it
>>> makes 32 cores much faster, then we have a smoking gun.
>>>
>>> Although everyone here is talking about this as an 'unoptimal' solution,
>>> the
>>> fact is there is no evidence that a connection pooler will fix the
>>> scalability from 16 to 32 cores.
>>> Certainly a connection pooler will help most results, but it may not fix
>>> the
>>> scalability problem.
>>>
>>> A question for Dimitri:
>>> What is the scalability from 16 to 32 cores at the 'peak' load that occurs
>>> near 2x the CPU count? Is it also poor? If this is also poor, IMO the
>>> community here should not be complaining about this unoptimal case -- a
>>> connection pooler at that stage does little and prepared statements will
>>> increase throughput but not likely alter scalability.
>>
>> I'm attaching a small graph showing the TPS level on PG 8.4 depending
>> on the number of cores (X-axis is the number of concurrent users,
>> Y-axis is the TPS number). As you may see, the TPS increase is near
>> linear while moving from 8 to 16 cores; on 32 cores it still grows,
>> though somewhat differently, and what is unclear is why the TPS level
>> stays limited to 11,000 TPS on 32 cores. And it's a pure read-only
>> workload.
>>
>
> Interesting. What hardware is this, btw? Looks like the 32 core system
> probably has 2x the CPU and a bit less interconnect efficiency versus the 16
> core one (which would be typical).
> Is the 16 core case the same, but with fewer cores per processor active? Or
> fewer processors total?
> Understanding the scaling difference may require a better understanding of
> the other differences besides core count.
>
>>>
>>> If that result scales, then the short term answer is a connection pooler.
>>>
>>> In the tests that Jignesh ran -- making the ProcArrayLock faster helped
>>> the
>>> case where connections = 2x the CPU core count quite a bit.
>>>
>>> The thread about the CPU scalability is "Proposal of tunable fix for
>>> scalability of 8.4", originally posted by "Jignesh K. Shah"
>>> <J(dot)K(dot)Shah(at)Sun(dot)COM>, March 11 2009.
>>>
>>> It would be very useful to see results of this benchmark with:
>>> 1. A Connection Pooler
>>
>> It will not help, as each client is *not* disconnecting/reconnecting
>> during the test, and PG keeps up well even with 256 users. Also, the
>> TPS limit is already reached at 64 users, so I don't think a pooler
>> will help here.
>>
>
> Actually, it might help a little. Postgres has a flaw that makes backends
> block on a lock briefly based on the number of total backends -- active or
> completely passive. Your tool has some (very small) user-side delay and a
> connection pooler would probably allow 64 of your users to efficiently 'fit'
> in 48 or so connection pooler slots.
>
> It is not about connecting and disconnecting in this case, it's about
> minimizing Postgres' process count. If this does help, it would hint at
> certain bottlenecks. If it doesn't, it would point elsewhere (and quiet some
> critics).
>
> However, it's unrealistic for any process-per-connection system to have fewer
> backends than about 2x the core count -- else any waiting on I/O or network
> will just starve the CPU. So this would just be done for research, not a real
> answer to making it scale better.
>
> For those who say "but, what if it's I/O bound! You don't need more
> backends then!": Well, you don't need more CPU either if you're I/O bound.
> By definition, CPU scaling tests imply the I/O can keep up.
>
>
>>> 2. Jignesh's patch
>>
>> I've already tested it and it did not help in my case because the real
>> problem is elsewhere.. (however, I have not yet tested it with my latest
>> config params)
>>
>
> Great to hear that! -- That means this case is probably not ProcArrayLock.
> If it's Solaris, could we get:
> 1. What are the CPU stats when it is in the inefficient state near 64 or 128
> concurrent users (vmstat, etc. I'm interested in CPU in
> user/system/idle/wait time, and context switches/sec mostly).
> 2. A DTrace probe on the postgres locks -- we might be able to identify
> something here.
>
> The results here would be useful -- if it's an expected condition in the
> planner or parser, it would be useful confirmation. If it's something
> unexpected and easy to fix -- it might be changed relatively soon.
>
> If it's not easy to detect, it could be many other things -- but the process
> above at least rules some things out and better characterizes the state.
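
As a rough complement to the DTrace idea, here is a minimal SQL sketch
(assuming the 8.4 catalog columns procpid and current_query) that can be run
while the server sits in the slow state; note it only shows heavyweight lock
waits, not the LWLock contention a DTrace probe would catch:

    -- backends currently blocked waiting on a (heavyweight) lock
    SELECT a.procpid, a.usename, l.locktype, l.mode, a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
    WHERE NOT l.granted
    ORDER BY a.procpid;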
>
>>> 3. Prepared statements
>>>
>>
>> yes, I'm preparing this test.
>>
>>> #3 is important, because prepared statements are ideal for queries that
>>> perform well with low statistics_targets, and not ideal for those that
>>> require high statistics targets. Realistically, an app won't have more
>>> than
>>> a couple dozen statement forms to prepare. Setting the default
>>> statistics
>>> target to 5 is just a way to make some other query perform badly.
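
For reference, a prepared statement here would look roughly like the
following minimal sketch (the table and column names are hypothetical, not
the actual benchmark query); the statement is parsed and planned once at
PREPARE time, and each EXECUTE only runs the stored plan:

    -- hypothetical query shape standing in for the benchmark statement
    PREPARE get_history (int) AS
        SELECT h.* FROM history h WHERE h.ref_id = $1;

    EXECUTE get_history(42);  -- no parse/plan cost on this call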
>>
>> Agreed, but as you may have a different statistics target *per* table, it
>> should not be a problem. What is sure is that all the time spent on
>> parsing and planning will be removed here, and the final time should be
>> pure execution.
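
For what it's worth, the statistics target is set per column rather than per
table; a minimal sketch with hypothetical table and column names:

    -- raise the sample size for one column only, leaving
    -- default_statistics_target untouched for everything else
    ALTER TABLE history ALTER COLUMN ref_id SET STATISTICS 100;
    ANALYZE history;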
>>
>
> I'm definitely interested here because although pure execution will
> certainly be faster, it may not scale any better.
>
>
>> Rgds,
>> -Dimitri
>>
>
>
