Re: Proposal of tunable fix for scalability of 8.4

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-13 18:38:57
Message-ID: C5DFF651.3439%scott@richrelevance.com
Lists: pgsql-performance

It's an interesting question, but the answer is most likely simply that the client can't keep up. And in the real world, no matter how incredible your connection pool is, there will be some inefficiency, some network delay, some client-side time, and so on.

I'm also still not sure whether we are dealing with a 64-thread or 128-thread machine.

The average query finishes in 6ms according to the results, so any bit of network latency will multiply the number of connections needed to saturate the server, and any small delay in the client between queries, or while going through a result set, will make it hard to sustain a 100% duty cycle.
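Roughly, the number of clients needed to keep the server saturated scales with the ratio of total per-request time (server + network + client) to server-side time. A back-of-the-envelope sketch, with made-up numbers rather than anything measured from this test:

    # Illustrative only -- all values below are hypothetical.
    server_threads = 64        # hardware threads available to run queries
    query_time     = 0.006     # seconds the server spends on each query
    client_delay   = 0.003     # network + client-side time per query

    # Each client can issue at most one query every (query_time + client_delay)
    # seconds, so the fraction of the time the server sees it busy is:
    duty_cycle = query_time / (query_time + client_delay)   # ~0.67 here

    # To keep every server thread busy you then need roughly:
    clients_needed = server_threads / duty_cycle             # ~96 here
    print(duty_cycle, clients_needed)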

The test result with zero delay stopped increasing linearly at about 128 users and 7ms average query response time, at ~2100 queries per second. If this is a 128-thread machine, then that means the clients are pretty fast. If it's a 64-thread machine, it means the clients can provide about a 50% duty cycle, which is not horrible.
This is 16.5 queries per second per client, or an average time per (query plus client delay) of 1/16.5 = ~6ms.
That is to say, either this is a 128-thread machine, or the test harness is measuring average response time including client-side delay, in which case there is a 50% duty cycle and ~3ms of client delay per request.
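Reading it that second way, with made-up round numbers just to show the arithmetic (these are not the figures from the test):

    # Hypothetical numbers to illustrate the reasoning, not measured values.
    total_qps = 20000.0                      # aggregate throughput
    clients   = 128
    per_client_qps = total_qps / clients     # ~156 queries/sec per client
    iteration_time = 1.0 / per_client_qps    # ~6.4 ms per (query + client delay)

    # If the server only spends half of each iteration on the query itself,
    # the clients present a ~50% duty cycle to the server:
    server_time = 0.0032                     # hypothetical server-side time
    duty_cycle  = server_time / iteration_time
    print(per_client_qps, iteration_time, duty_cycle)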

What would really help is a counter that tracks the number of active postgres connections, so one can compare it to the total connection count. Idle and idle-in-transaction counts would also be hugely useful to track as dynamic statistics or counters for load testing. For all of these, an average over the last second or so is much more useful than an instantaneous count.
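In the meantime, something along these lines approximates it by sampling pg_stat_activity once a second and averaging. This is a rough sketch, not a polished tool: it assumes the 8.x-era view where idle backends report '<IDLE>' or '<IDLE> in transaction' in current_query (newer servers expose a separate state column instead), and the connection string and sample window are placeholders to adjust for your setup:

    # Rough sketch only -- the dbname and 10-second window are arbitrary.
    import time
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=postgres")
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()

    totals = {"active": 0, "idle": 0, "idle_in_xact": 0, "total": 0}
    samples = 10
    for _ in range(samples):
        cur.execute("""
            SELECT
              sum(CASE WHEN current_query NOT LIKE '<IDLE>%' THEN 1 ELSE 0 END),
              sum(CASE WHEN current_query = '<IDLE>' THEN 1 ELSE 0 END),
              sum(CASE WHEN current_query = '<IDLE> in transaction' THEN 1 ELSE 0 END),
              count(*)
            FROM pg_stat_activity
        """)
        active, idle, idle_in_xact, total = cur.fetchone()
        totals["active"] += active
        totals["idle"] += idle
        totals["idle_in_xact"] += idle_in_xact
        totals["total"] += total
        time.sleep(1)

    # Report per-state averages over the sampling window.
    for name, count in totals.items():
        print(name, count / float(samples))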

On 3/13/09 11:02 AM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think that changing the locking behavior is attacking the problem
>> at the wrong level anyway.
>
> Right. By the time a patch here could have any effect, you've
> already lost the game --- having to deschedule and reschedule a
> process is a large cost compared to the typical lock hold time for
> most LWLocks. So it would be better to look at how to avoid
> blocking in the first place.

That's what motivated my request for a profile of the "80 clients with
zero wait" case. If all data access is in RAM, why can't 80 processes
keep 64 threads (on 8 processors) busy? Does anybody else think
that's an interesting question, or am I off in left field here?

-Kevin
