Re: Proposal of tunable fix for scalability of 8.4

From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-13 14:56:45
Message-ID: 49BA742D.3050203@sun.com
Lists: pgsql-performance


>>
>>
>> In general, I suggest that it is useful to run tests with a few
>> different types of pacing. Zero delay pacing will not have realistic
>> number of connections, but will expose bottlenecks that are
>> universal, and less controversial. Small latency (100ms to 1s) tests
>> are easy to make from the zero delay ones, and help expose problems
>> with connection count or other forms of ‘non-active’ concurrency.
>> End-user realistic delays are app specific, and useful with larger
>> holistic load tests (say, through the application interface).
>> Generally, running them in this order helps because at each stage you
>> are adding complexity. Based on your explanations, you’ve probably
>> done much of this so far and your approach sounds solid to me.
>> If the first case fails (zero delay, smaller user count), there is no
>> way the others will pass.
>>
>>
>
> I think I have done that before, so I can do that again by running the
> users at 0 think time, which will represent a highly utilized
> "connection pool", and test how big the connection pool can be before
> the throughput tanks. This can be useful for app servers which set up
> connection pools of their own talking to PostgreSQL.
>
> -Jignesh
>
>
So I backed out my change and used the stock 8.4 snapshot that I had
downloaded. Now, with 0 think time, I am doing runs with far fewer users,
and I still cannot get it to reach 100% CPU:
60: 8: Medium Throughput: 7761.000 Avg Medium Resp: 0.004
120: 16: Medium Throughput: 16876.000 Avg Medium Resp: 0.004
180: 24: Medium Throughput: 25359.000 Avg Medium Resp: 0.004
240: 32: Medium Throughput: 33104.000 Avg Medium Resp: 0.005
300: 40: Medium Throughput: 42200.000 Avg Medium Resp: 0.005
360: 48: Medium Throughput: 49996.000 Avg Medium Resp: 0.005
420: 56: Medium Throughput: 58260.000 Avg Medium Resp: 0.005
480: 64: Medium Throughput: 66289.000 Avg Medium Resp: 0.005
540: 72: Medium Throughput: 74667.000 Avg Medium Resp: 0.005
600: 80: Medium Throughput: 82632.000 Avg Medium Resp: 0.005
660: 88: Medium Throughput: 90211.000 Avg Medium Resp: 0.006
720: 96: Medium Throughput: 98236.000 Avg Medium Resp: 0.006
780: 104: Medium Throughput: 105517.000 Avg Medium Resp: 0.006
840: 112: Medium Throughput: 112921.000 Avg Medium Resp: 0.006
900: 120: Medium Throughput: 118256.000 Avg Medium Resp: 0.007
960: 128: Medium Throughput: 126499.000 Avg Medium Resp: 0.007
1020: 136: Medium Throughput: 133354.000 Avg Medium Resp: 0.007
1080: 144: Medium Throughput: 135826.000 Avg Medium Resp: 0.008
1140: 152: Medium Throughput: 121729.000 Avg Medium Resp: 0.012
1200: 160: Medium Throughput: 130487.000 Avg Medium Resp: 0.011
1260: 168: Medium Throughput: 123368.000 Avg Medium Resp: 0.013
1320: 176: Medium Throughput: 134649.000 Avg Medium Resp: 0.012
1380: 184: Medium Throughput: 136272.000 Avg Medium Resp: 0.013

vmstat shows that the CPUs are hardly busy on the 64-CPU system (a CPU is
reported busy only when there is an active process assigned to it):
-bash-3.2$ vmstat 30
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id
19 0 0 52691088 46220848 27 302 10 68 68 0 3 1 -0 -0 -0 13411 20762 26854 5 3 92
0 0 0 45095664 39898296 0 455 0 0 0 0 0 0 0 0 0 698 674 295 0 0 100
0 0 0 45040640 39867056 5 13 0 0 0 0 0 0 0 0 0 3925 4189 5721 0 0 99
0 0 0 45038856 39864016 0 5 0 0 0 0 0 0 0 0 0 9479 8643 15205 1 1 98
0 0 0 45037760 39862552 0 14 0 0 0 0 0 0 0 0 0 12088 9041 19890 2 1 98
0 0 0 45035960 39860080 0 6 0 0 0 0 0 0 0 0 0 16590 11611 28351 2 1 97
0 0 0 45034648 39858416 0 17 0 0 0 0 0 0 0 0 0 19192 13027 33218 3 1 96
0 0 0 45032360 39855464 0 10 0 0 0 0 0 0 0 0 0 22795 16467 40392 4 1 95
0 0 0 45030840 39853568 0 22 0 0 0 0 0 0 0 0 0 25349 18315 45178 4 1 94
0 0 0 45027456 39849648 0 10 0 0 0 0 0 0 0 0 0 28158 22500 50804 5 2 93
0 0 0 45000752 39832608 0 38 0 0 0 0 0 0 0 0 0 31332 25744 56751 6 2 92
0 0 0 45010120 39836728 0 6 0 0 0 0 0 0 0 0 0 36636 29334 66505 7 2 91
0 0 0 45017072 39838504 0 29 0 0 0 0 0 0 0 0 0 38553 32313 70915 7 2 91
0 0 0 45011384 39833768 0 11 0 0 0 0 0 0 0 0 0 41186 35949 76275 8 3 90
0 0 0 44890552 39826136 0 40 0 0 0 0 0 0 0 0 0 45123 44507 83665 9 3 88
0 0 0 44882808 39822048 0 6 0 0 0 0 0 0 0 0 0 49342 53431 91783 10 3 87
0 0 0 45003328 39825336 0 42 0 0 0 0 0 0 0 0 0 48516 42515 91135 10 3 87
0 0 0 44999688 39821008 0 6 0 0 0 0 0 0 0 0 0 54695 48741 102526 11 3 85
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id
0 0 0 44980744 39806400 0 55 0 0 0 0 0 0 0 0 0 54968 51946 103245 12 4 84
0 0 0 44992288 39812256 0 6 0 1 1 0 0 0 0 0 0 60506 58205 113911 13 4 83
0 0 0 44875648 39802128 1 60 0 0 0 0 0 1 0 0 0 60485 66576 114081 13 4 83
0 0 0 44848792 39795008 0 8 0 0 0 0 0 1 0 0 0 66760 75060 126202 15 5 80
0 0 0 44837168 39786432 0 57 0 0 0 0 0 0 0 0 0 66015 68256 125209 15 4 81
1 0 0 44832680 39779064 0 7 0 0 0 0 0 0 0 0 0 72728 79089 138077 17 5 79
1 0 0 44926640 39773160 0 69 0 0 0 0 0 0 0 0 0 71990 79148 136786 17 5 78
1 0 0 44960800 39781416 0 6 0 0 0 0 0 0 0 0 0 75442 77829 143783 18 5 77
1 0 0 44846472 39773960 0 68 0 0 0 0 0 0 0 0 0 80395 97964 153336 19 6 75
1 0 0 44887168 39770680 0 7 0 0 0 0 0 0 0 0 0 80010 88144 152699 19 6 75
1 0 0 44951152 39769576 0 68 0 0 0 0 0 0 0 0 0 83670 85394 159745 20 6 74
1 0 0 44946080 39763120 0 7 0 0 0 0 0 0 0 0 0 85416 91961 163147 21 6 73
1 0 0 44923928 39744640 0 83 0 0 0 0 0 0 0 0 0 87625 104894 167412 22 6 71
1 0 0 44929704 39745368 0 7 0 0 0 0 0 0 0 0 0 93280 103922 178357 24 7 69
1 0 0 44822712 39738744 0 82 0 0 0 0 0 0 0 0 0 91739 113747 175232 23 7 70
1 0 0 44790040 39730168 0 6 0 0 0 0 0 0 0 0 0 96159 122496 183642 25 7 68
1 0 0 44868808 39733872 0 82 0 0 0 0 0 0 0 0 0 96166 107465 183502 25 7 68
2 0 0 44913296 39730272 0 6 0 0 0 0 0 0 0 0 0 103573 114064 197502 27 8 65
1 0 0 44890768 39712424 0 96 0 0 0 0 0 0 0 0 0 102235 123767 194747 28 8 64
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id
2 0 0 44900096 39716808 0 6 0 0 0 0 0 0 0 0 0 97323 112955 185647 27 8 65
1 0 0 44793360 39708336 0 94 0 0 0 0 0 0 0 0 0 98631 131539 188076 27 8 65
2 0 0 44765136 39700536 0 8 0 0 0 0 0 0 0 0 0 90489 117037 172603 27 8 66
1 0 0 44887392 39700024 0 94 0 0 0 0 0 0 0 0 0 95832 106992 182677 27 8 65
2 0 0 44881856 39692632 0 6 0 0 0 0 0 0 0 0 0 95015 109679 181194 27 8 65
1 0 0 44860928 39674856 0 110 0 0 0 0 0 0 0 0 0 92909 119383 177459 27 8 65
1 0 0 44861320 39671704 0 8 0 0 0 0 0 0 0 0 0 94677 110967 180832 28 8 64
1 0 0 44774424 39676000 0 108 0 0 0 0 0 0 0 0 0 94953 123457 181397 27 8 65
1 0 0 44733000 39668528 0 6 0 0 0 0 0 0 0 0 0 100719 132038 192550 29 9 63
1 0 0 44841888 39668864 0 106 0 0 0 0 0 0 0 0 0 97293 109177 185589 28 8 64
1 0 0 44858976 39663592 0 6 0 0 0 0 0 0 0 0 0 103199 118256 197049 30 9 62
1 0 0 44837216 39646416 0 122 0 0 0 0 0 0 0 0 0 105637 133127 201788 31 9 60
1 0 0 44842624 39647232 0 8 0 0 0 0 0 0 0 0 0 110530 131454 211139 32 9 59
2 0 0 44740624 39638832 1 127 0 0 0 0 0 0 0 0 0 111114 145135 212398 32 9 59
2 0 0 44690824 39628568 0 8 0 0 0 0 0 0 0 0 0 109934 146164 210454 32 10 59
2 0 0 44691912 39616000 0 132 0 0 0 0 0 0 0 0 0 108231 132279 206885 32 9 59
1 0 0 44797968 39609832 0 9 0 0 0 0 0 0 0 0 0 111582 135125 213446 33 10 58
3 0 0 44781632 39598432 0 135 0 0 0 0 0 0 0 0 0 115277 150356 220792 34 10 56
5 0 0 44791408 39600432 0 10 0 0 0 0 0 0 0 0 0 111428 137996 212559 33 9 58
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id
3 0 0 44710008 39603320 0 135 0 0 0 0 0 0 0 0 0 110564 145678 211567 33 10 57
5 0 0 44663368 39595008 0 6 0 0 0 0 0 0 0 0 0 108891 143083 208389 33 10 58
3 0 0 44753496 39593824 0 132 0 0 0 0 0 0 0 0 0 109922 126865 209869 33 9 57
4 0 0 44788368 39588528 0 7 0 0 0 0 0 0 0 0 0 108680 129073 208068 33 10 57
2 0 0 44767920 39570592 0 147 0 0 0 0 0 0 0 0 0 106671 142403 204724 33 10 58
4 0 0 44762656 39563256 0 11 0 0 0 0 0 0 0 0 0 106185 130328 204551 34 10 57
2 0 0 44674584 39560912 0 148 0 0 0 0 0 0 0 0 0 104757 139147 201448 32 10 58
1 0 0 44619824 39551024 0 9 0 0 0 0 0 0 0 0 0 103653 142125 199896 32 10 58
2 0 0 44622480 39552432 0 141 0 0 0 0 0 0 0 0 0 101373 134547 195553 32 9 58
1 0 0 44739936 39552312 0 11 0 0 0 0 0 0 0 0 0 102932 121742 198205 33 9 58

And the lock stats, sampled for a single backend process at about 280
users, are as follows:
# ./84_lwlock.d 29405

Lock Id Mode State Count
WALWriteLock Exclusive Acquired 1
XidGenLock Exclusive Waiting 1
CLogControlLock Shared Waiting 3
ProcArrayLock Shared Waiting 7
CLogControlLock Exclusive Waiting 9
WALInsertLock Exclusive Waiting 45
CLogControlLock Shared Acquired 52
ProcArrayLock Exclusive Waiting 61
XidGenLock Exclusive Acquired 96
ProcArrayLock Exclusive Acquired 97
CLogControlLock Exclusive Acquired 152
WALInsertLock Exclusive Acquired 302
ProcArrayLock Shared Acquired 729
FirstLockMgrLock Shared Acquired 812
FirstBufMappingLock Shared Acquired 857
SInvalReadLock Shared Acquired 1551

Lock Id Mode State Combined Time (ns)
WALInsertLock Acquired 89909
XidGenLock Exclusive Waiting 101488
WALWriteLock Exclusive Acquired 140563
CLogControlLock Shared Waiting 354756
FirstBufMappingLock Acquired 471438
FirstLockMgrLock Acquired 2907141
XidGenLock Exclusive Acquired 7450934
CLogControlLock Exclusive Waiting 11094716
ProcArrayLock Acquired 15495229
WALInsertLock Exclusive Waiting 20801169
CLogControlLock Exclusive Acquired 21339264
SInvalReadLock Acquired 24309991
FirstLockMgrLock Exclusive Acquired 39904071
FirstBufMappingLock Exclusive Acquired 40826435
ProcArrayLock Shared Waiting 86352947
WALInsertLock Exclusive Acquired 89336432
SInvalReadLock Exclusive Acquired 252574515
ProcArrayLock Exclusive Acquired 315064347
ProcArrayLock Exclusive Waiting 847806215

The mpstat output is too long, so I am aggregating by processor set, which
here covers all 64 CPUs:

-bash-3.2$ mpstat -a 10

SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 370 0 118649 127575 7595 244456 43931 62166 8700 0 158929 38 11 0 50 64
0 167 0 119668 128704 7644 246389 43287 62357 8816 0 161006 38 11 0 51 64
0 27 0 109461 117433 6997 224514 38562 56446 8171 0 148322 34 10 0 56 64
0 2 0 122368 131549 7871 250237 39620 61478 9082 0 165995 36 11 0 52 64
0 0 0 122025 131380 7973 249429 37292 59863 8922 0 166319 35 11 0 54 64

(Quick overview of the columns:)
SET    processor set
minf   minor faults
mjf    major faults
xcal   inter-processor cross-calls
intr   interrupts
ithr   interrupts as threads (not counting clock interrupt)
csw    context switches
icsw   involuntary context switches
migr   thread migrations (to another processor)
smtx   spins on mutexes (lock not acquired on first try)
srw    spins on readers/writer locks (lock not acquired on first try)
syscl  system calls
usr    percent user time
sys    percent system time
wt     I/O wait time; no longer calculated as a percentage of CPU time,
       so this statistic always returns zero
idl    percent idle time
sze    number of processors in the requested processor set

-Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
The New Sun Microsystems,Inc http://sun.com/postgresql
