Re: Proposal of tunable fix for scalability of 8.4

From: Scott Carey <scott(at)richrelevance(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-12 22:57:05
Message-ID: C5DEE151.33BA%scott@richrelevance.com
Lists: pgsql-performance

On 3/12/09 11:37 AM, "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> wrote:

And again, this is the third time I am saying this: the test users also have some latency built into them, which is what is generally exploited to get more users than the number of CPUs on the system, and that's the point we want to exploit. Otherwise, if all new users began to do their jobs with no latency, we would need 6+ billion CPUs to handle all possible users. Typically, as an administrator (system and database), I can only tweak/control latencies within my domain, that is network, disk, CPUs, etc. Those are what I am tweaking to arrive at a *Configured* environment, and now I am trying to improve lock contention/waits in PostgreSQL so that we have an optimized setup.

In general, I suggest running tests with a few different types of pacing. Zero-delay pacing will not have a realistic number of connections, but it will expose bottlenecks that are universal and less controversial. Small-latency tests (100 ms to 1 s) are easy to derive from the zero-delay ones and help expose problems with connection count or other forms of 'non-active' concurrency. End-user realistic delays are app-specific and are useful with larger, holistic load tests (say, through the application interface). Running them in this order helps because each stage adds complexity. Based on your explanations, you've probably done much of this already, and your approach sounds solid to me.
If the first case fails (zero delay, smaller user count), there is no way the others will pass.
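
To make the pacing stages concrete, here is a minimal sketch of a closed-loop load driver in Python. It is only an illustration, not the harness used for the tests in this thread: `run_paced_load`, `run_stages`, and the stage tuples are hypothetical names, and `work` stands in for whatever issues one transaction against the database.

```python
import threading
import time


def run_paced_load(work, users, think_time_s, duration_s):
    """Call `work` from `users` threads for roughly `duration_s` seconds.

    Each simulated user sleeps `think_time_s` between calls (0 = zero-delay
    pacing). Returns completed calls per second across all users.
    """
    counts = [0] * users  # one slot per thread, so no lock is needed
    deadline = time.monotonic() + duration_s

    def user_loop(slot):
        while time.monotonic() < deadline:
            work()
            counts[slot] += 1
            if think_time_s > 0:
                time.sleep(think_time_s)

    threads = [threading.Thread(target=user_loop, args=(i,)) for i in range(users)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.monotonic() - start
    return sum(counts) / elapsed


def run_stages(work, stages, duration_s):
    """Run stages in order; each stage is (name, users, think_time_s).

    Returns {name: throughput}. Ordering the stages from zero delay upward
    mirrors the idea that each stage adds complexity to the previous one.
    """
    results = {}
    for name, users, think_time_s in stages:
        results[name] = run_paced_load(work, users, think_time_s, duration_s)
    return results
```

A run might pass stages such as `[("zero delay", 8, 0.0), ("small latency", 200, 0.5)]`, where the user counts and delays are placeholders to be chosen for the server under test. The application-level stage with realistic delays is deliberately left out, since that one is app-specific.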

I am trying another run where I limit the number of woken threads to a pre-configured value, to see how various values pan out in terms of throughput on this server.

Regards,
Jignesh

This would be good, as would waking up only the waiters for shared locks, but refining the test to make it maximally convincing would help. The first thing to show is either a test with very small or no sleep delay, or one with a connection pooler in between. I prefer the former since it is the simplest. Such a test is less entangled with the connection count and should peak much closer to the CPU core count, which some will find more convincing. I'm positive it won't change the basic trend (ramp up and plateau, with a higher plateau with the changed lock code), but others seem unconvinced and I'm a nobody anyway.
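
The reason a zero-delay test should peak near the core count can be sketched with the standard asymptotic throughput bound for a closed system: with N users, think time Z, per-transaction CPU service time S and C cores, throughput is at most min(N / (S + Z), C / S). The Python below is a back-of-the-envelope model under those assumptions only; the function names are hypothetical, and it ignores lock contention entirely, which is the very thing the patched lock code changes.

```python
def throughput_bound(users, cores, service_s, think_s):
    """Upper bound on transactions/second for a closed system.

    users     -- number of simulated users (N)
    cores     -- number of CPU cores (C)
    service_s -- CPU time per transaction in seconds (S), must be > 0
    think_s   -- sleep/think time between transactions in seconds (Z)
    """
    if service_s <= 0:
        raise ValueError("service_s must be positive")
    demand_limited = users / (service_s + think_s)  # ramp-up region
    capacity_limited = cores / service_s            # plateau
    return min(demand_limited, capacity_limited)


def knee_users(cores, service_s, think_s):
    """User count at which the ramp meets the plateau: C * (S + Z) / S."""
    if service_s <= 0:
        raise ValueError("service_s must be positive")
    return cores * (service_s + think_s) / service_s
```

With no think time the knee is exactly the core count, whereas any think time pushes it out by a factor of (S + Z) / S, so the peak lands at a user count far above the number of cores. Real curves will sit below this bound; the point is only where the plateau begins.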
