From: pgsql-performance-owner(at)postgresql(dot)org [pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Simon Riggs [simon(at)2ndQuadrant(dot)com]
Sent: Wednesday, March 18, 2009 12:53 AM
To: Matthew Wakeling
Subject: Re: [PERFORM] Proposal of tunable fix for scalability of 8.4
> On Mon, 2009-03-16 at 16:26 +0000, Matthew Wakeling wrote:
> > One possibility would be for the locks to alternate between exclusive
> > and
> > shared - that is:
> > 1. Take a snapshot of all shared waits, and grant them all -
> > thundering
> > herd style.
> > 2. Wait until ALL of them have finished, granting no more.
> > 3. Take a snapshot of all exclusive waits, and grant them all, one by
> > one.
> > 4. Wait until all of them have been finished, granting no more.
> > 5. Back to (1)
> I agree with that, apart from the "granting no more" bit.
> Currently we queue up exclusive locks, but there is no need to since for
> ProcArrayLock commits are all changing different data.
> The most useful behaviour is just to have two modes:
> * exclusive-lock held - all other x locks welcome, s locks queue
> * shared-lock held - all other s locks welcome, x locks queue
> This *only* works for ProcArrayLock.
> Simon Riggs www.2ndQuadrant.com
> PostgreSQL Training, Services and Support
I want to comment on an important distinction between these two variants. The "granting no more" bit WILL decrease performance under high contention. Here is my reasoning.
We have two "two lists" proposals.
Type A: allow line cutting (Simon, above):
* exclusive-lock held and all exclusives process - all other NEW x locks welcome, s locks queue
* shared-lock held and all shareds process- all other NEW s locks welcome, x locks queue
Type B: forbid line cutting (Matthew, above, modified to allow multiple exclusive for ProcArrayLock --
for other types exclusive would be one at a time)
* exclusive-lock held and all exclusives process - all NEW lock requests queue
* shared-lock held and shareds process - all NEW lock requests queue
A big benefit of the "wake all" proposal, is that a lot of access does not have to context switch out and back in. On a quick assessment, the type A above would lock and context switch even less than the wake-all (since exclusives don't go one at a time) but otherwise be similar. But this won't matter much if it is shared lock dominated.
I would LOVE to have seen context switch rate numbers with the results so far, but many base unix tools don't show it by default (can get it from sar, rstat reports it) average # of context switches per transaction is an awesome measure of lock contention and lock efficiency.
In type A above, the ratio of requests that require a context switch is Q / (M + Q), where Q is the average queue size when the 'shared-exclusive' swap occrs and M is the average number of "line cutters".
In type B, the ratio of requests that must context switch is always == 1. Every request must queue and wait! This may perform worse than the current lock!
One way to guarantee some fairness is to compromise between the two.
Lets call this proposal C. Unfortunately, this is less elegant than the other two, since it has logic for both. It could be made tunable to be the complete spectrum though.
* exclusive-lock held and all exclusives process - first N new X requests welcome, N+1 and later X requests and all shared locks queue.
* shared-lock held and shareds process - first N new S requests welcom, N+1 and later S requests and all X locks queue
So, if shared locks are queuing and exclusive hold the lock and are operating, and another exclusive request arrives, it can cut in line only if it is one of the first N to do so before it will queue and wait and give shared locks their turn.
This counting condition can be done with an atomically incrementing integer using compare and set operations and no locks, and under heavy contention will reduce the number of context switches per operation to Q/(N + Q) where N is the number of 'line cutters' achieved and Q is the average queue size when the queued items are unlocked. Note how this is the same as the 'unbounded' equation with M above, except that N can never be greater than M (the 'natural' line cut count).
So for N = Q half are forced to context switch and half cut in line without a context switch. N can be tunable, and it can be a different number for shared and exclusive to bias towards one or the other if desired.
In response to
pgsql-performance by date
|Next:||From: Nimesh Satam||Date: 2009-03-20 09:03:58|
|Subject: Re: Prepared statement does not exist|
|Previous:||From: Scott Carey||Date: 2009-03-20 06:01:05|
|Subject: Re: Proposal of tunable fix for scalability of 8.4|