Re: User concurrency thresholding: where do I look?

From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: User concurrency thresholding: where do I look?
Date: 2007-07-20 21:24:33
Message-ID: 46A12811.4020205@sun.com
Lists: pgsql-performance
True, you can't switch off the locks, since libthread has been folded
into libc in Solaris 10.

Anyway, just to give you an idea of the increase in context switching at
the break point, here is the mpstat output (taken at 10-second intervals)
on this 8-socket Sun Fire V890.

icsw (involuntary context switches) stays low up to about the 950-1000
user mark. A context switch storm then starts once the user count passes
the 1000-1050 mark, and total throughput drops by about 30%
instantaneously. I will try rebuilding PostgreSQL with DTrace probes to
get more clues. (NOTE: you will see one CPU (cpuid 22) doing more system
work; that's the one handling the network interrupts.)


CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   57   0   27   108    6 4072   98 1749  416    1  7763   47  13   0  40
  1   46   0   24    22    6 4198   11 1826  427    0  7547   45  13   0  42
  2   42   0   34   104    8 4103   91 1682  424    1  7797   46  13   0  41
  3   51   0   22    21    6 4125   10 1734  435    0  7399   45  13   0  43
  4   65   0   27    19    6 4015    8 1706  411    0  7292   44  15   0  41
  5   54   0   21    21    6 4297   10 1702  464    0  7708   45  13   0  42
  6   36   0   16    66   47 4218   12 1713  426    0  7685   47  11   0  42
  7   40   0  100   318  206 3699   10 1534  585    0  6851   45  14   0  41
 16   41   0   30    87    5 3780   78 1509  401    1  7604   45  13   0  42
 17   39   0   24    22    5 3970   12 1631  408    0  7265   44  12   0  44
 18   42   0   24    99    5 3829   89 1519  401    1  7343   45  12   0  43
 19   39   0   31 78830    5 3588    8 1509  400    0  6629   43  13   0  44
 20   22   0   20    19    6 3925    9 1577  419    0  7364   44  12   0  44
 21   38   0   31    23    5 3792   13 1566  407    0  7133   45  12   0  44
 22    8   0  110  7053 7045 1641    8  728  838    0  2917   16  50   0  33
 23   62   0   29    21    5 3985   10 1579  449    0  7368   44  12   0  44
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   13   0   27   123    6 4228  113 1820  433    1  8084   49  13   0  38
  1   16   0   63    26    6 4253   15 1875  420    0  7754   47  14   0  39
  2   11   0   31   110    8 4178   97 1741  425    1  8095   48  14   0  38
  3    8   0   24    20    6 4257    9 1818  444    0  7807   47  13   0  40
  4   13   0   54    28    6 4145   17 1774  426    1  7732   46  16   0  38
  5   12   0   35    23    6 4412   12 1775  447    0  8249   48  13   0  39
  6    8   0   24    38   15 4323   14 1760  422    0  8016   49  11   0  39
  7    8   0  120   323  206 3801   15 1599  635    0  7290   47  15   0  38
 16   11   0   44   107    5 3896   98 1582  393    1  7997   47  15   0  39
 17   15   0   29    24    5 4120   14 1716  416    0  7648   46  13   0  41
 18    9   0   35   113    5 3933  103 1594  399    1  7714   47  13   0  40
 19    8   0   34 83271    5 3702   12 1564  403    0  7010   45  14   0  41
 20    7   0   28    27    6 3997   16 1624  400    0  7676   46  13   0  41
 21    8   0   28    25    5 3997   15 1664  402    0  7658   47  12   0  41
 22    4   0   97  7741 7731 1586   11  704  906    0  2933   17  51   0  32
 23   13   0   28    25    5 4144   15 1658  437    0  7810   47  12   0  41
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  141   315    6 9262  301 2812  330    0 10905   49  16   0  35
  1    1   0  153   199    6 9400  186 2808  312    0 11066   48  16   0  37
  2    0   0  140   256    8 8798  242 2592  310    0 10111   47  15   0  38
  3    1   0  141   189    6 8803  172 2592  314    0 10171   47  15   0  39
  4    0   0  120   214    6 9540  207 2801  322    0 10531   46  17   0  36
  5    1   0  152   180    6 8764  161 2564  342    0  9904   47  15   0  38
  6    1   0  107   344  148 8180  181 2512  290    0  9314   51  14   0  35
  7    0   0  665   443  204 8733  153 2574  404    0  9892   43  21   0  37
 16    0   0  113   217    5 6446  201 1975  265    0  7552   45  12   0  44
 17    0   0  107   153    5 6568  140 2021  274    0  7586   44  11   0  45
 18    0   0  121   215    5 6072  201 1789  276    1  7690   44  12   0  44
 19    1   0  102 47142    5 6123  126 1829  262    0  7185   43  12   0  45
 20    0   0  102   143    6 6451  129 1939  262    0  7450   43  13   0  44
 21    1   0  106   150    5 6538  133 1997  285    0  7425   44  11   0  44
 22    0   0  494  5949 5876 3586   73 1040  399    0  4058   26  39   0  34
 23    0   0  102   159    5 6393  142 1942  324    0  7226   43  12   0  46
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  217   441    7 10763  426 3234  363    0 12449   47  18   0  35
  1    0   0  210   322    7 11113  309 3273  351    0 12527   46  17   0  37
  2    1   0  212   387    8 10306  370 2977  354    0 11320   45  16   0  38
  3    0   0  230   276    7 10332  257 2947  341    0 11901   43  16   0  40
  4    0   0  234   306    7 11324  290 3265  352    0 12805   45  18   0  37
  5    0   0  212   284    7 10590  262 3042  388    0 11789   44  17   0  39
  6    1   0  154   307   48 9583  241 2903  324    0 10564   50  15   0  35
  7    0   0  840   535  206 10354  247 3035  428    0 11700   42  22   0  37
 16    0   0  169   303    5 7446  286 2250  290    0  8361   42  13   0  45
 17    0   0  173   240    5 7640  225 2288  295    0  8674   41  13   0  47
 18    0   0  170   289    5 7445  270 2108  286    0  8167   41  12   0  47
 19    0   0  176 51118    5 7365  197 2138  288    0  7934   40  13   0  47
 20    1   0  172   222    6 7835  204 2323  298    0  8759   40  14   0  46
 21    0   0  167   233    5 7749  218 2339  326    0  8264   42  13   0  46
 22    0   0  749  6612 6516 4173   97 1166  421    0  4741   23  44   0  33
 23    0   0  181   239    6 7709  219 2258  383    0  8402   41  12   0  47
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  198   439    6 10364  417 3113  327    0 11962   49  17   0  34
  1    0   0  210   299    6 10655  282 3135  346    0 12463   47  17   0  36
  2    0   0  202   352    8 9960  332 2890  320    0 11261   47  16   0  37
  3    0   0  182   276    6 9950  255 2857  334    0 11021   46  16   0  38
  4    0   0  200   305    6 10841  286 3127  325    0 12440   48  18   0  35
  5    0   0  240   286    6 9983  272 2912  358    0 11450   46  16   0  37
  6    0   0  153   323   81 9062  233 2767  300    0  9675   49  18   0  33
  7    0   0  850   556  206 10027  271 2910  415    0 11048   43  22   0  35
 16    0   0  152   306    5 7261  291 2216  266    0  8055   44  12   0  44
 17    0   0  151   236    5 7193  217 2170  283    0  8099   43  12   0  45
 18    0   0  170   263    5 7008  246 2009  254    0  7836   43  12   0  46
 19    0   0  165 47738    5 6824  197 1989  273    0  7663   42  12   0  46
 20    0   0  188   217    6 7496  197 2222  280    0  8435   43  13   0  44
 21    0   0  179   248    5 7352  234 2233  309    0  8237   43  12   0  44
 22    0   0  813  6041 5963 4006   82 1125  448    0  4442   25  42   0  33
 23    0   0  162   241    5 7364  225 2170  355    0  7720   43  11   0  45
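
For anyone who wants to compare the intervals numerically rather than by
eye, here is a minimal Python sketch that splits mpstat output into
samples and averages a column per CPU. It is an illustration only and was
not used for the measurements above. The names parse_mpstat and
mean_per_cpu are made up for this example, and SAMPLE holds just the
CPU 0 and CPU 1 rows from the first and third samples above.

```python
"""Summarise Solaris mpstat output: per-CPU mean of a column per sample."""


def parse_mpstat(text):
    """Split mpstat text into samples; each sample is a list of row dicts."""
    samples = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            # Every header line starts a new sampling interval.
            columns = fields
            samples.append([])
            continue
        if columns is None or len(fields) != len(columns):
            # Skip rows seen before any header and rows broken by mail wrapping.
            continue
        samples[-1].append(dict(zip(columns, map(int, fields))))
    return samples


def mean_per_cpu(sample, column):
    """Average one column over all CPU rows of a sample (0.0 if empty)."""
    if not sample:
        return 0.0
    return sum(row[column] for row in sample) / len(sample)


# Two rows each from the first and third samples in the message (CPUs 0, 1).
SAMPLE = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   57   0   27   108    6 4072   98 1749  416    1  7763   47  13   0  40
  1   46   0   24    22    6 4198   11 1826  427    0  7547   45  13   0  42
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  141   315    6 9262  301 2812  330    0 10905   49  16   0  35
  1    1   0  153   199    6 9400  186 2808  312    0 11066   48  16   0  37
"""

samples = parse_mpstat(SAMPLE)
for index, sample in enumerate(samples, start=1):
    print(index, mean_per_cpu(sample, "csw"), mean_per_cpu(sample, "icsw"))
```

On those four rows the mean icsw goes from 54.5 to 243.5 and the mean csw
from 4135 to 9331. Feeding in the full output would give the figure
across all 16 CPUs for each interval.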




Tom Lane wrote:
> "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
>   
>> What its saying is that there are holds/waits in trying to get locks 
>> which are locked at Solaris user library levels called from the  
>> postgresql functions:
>> For example both the following functions are hitting on the same mutex 
>> lock  0x10059e280  in Solaris Library call:
>> postgres`AllocSetDelete+0x98
>> postgres`AllocSetAlloc+0x1c4
>>     
>
> That's a perfect example of the sort of useless overhead that I was
> complaining of just now in pgsql-patches.  Having malloc/free use
> an internal mutex is necessary in multi-threaded programs, but the
> backend isn't multi-threaded.  And yet, apparently you can't turn
> that off in Solaris.
>
> (Fortunately, the palloc layer is probably insulating us from malloc's
> performance enough that this isn't a huge deal.  But it's annoying.)
>
> 			regards, tom lane
