Re: Excessive context switching on SMP Xeons

From: Bill Montgomery <billm(at)lulu(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive context switching on SMP Xeons
Date: 2004-10-05 21:08:32
Message-ID: 41630D50.3020308@lulu.com
Lists: pgsql-performance

Thanks for the helpful response.

Josh Berkus wrote:

>First off, the good news: Gavin Sherry and OSDL may have made some progress
>on this. We'll be testing as soon as OSDL gets the Scalable Test Platform
>running again. If you have the CS problem (which I don't think you do, see
>below) and a test box, I'd be thrilled to have you test it.
>

I'd be thrilled to test it too, if for no other reason than to determine
whether what I'm experiencing really is the "CS problem".

>1) I don't really consider a CS of 30,000 to 60,000 on Xeon to be excessive.
>People demonstrating the problem on dual or quad Xeon reported CS levels of
>150,000 or more. So you probably don't have this issue at all -- depending
>on the load, your level could be considered "normal".
>

Fair enough. I never see nearly this much context switching on my dual
Xeon boxes running dozens (sometimes hundreds) of concurrent apache
processes, but I'll concede this could just be due to the more parallel
nature of a bunch of independent apache workers.

>>I am experiencing said symptom on two different dual-Xeon boxes, both
>>Dells with ServerWorks chipsets, running the latest RH9 and RHEL3
>>kernels, respectively. The databases are 90% read, 10% write, and are
>>small enough to fit entirely into main memory, between pg shared buffers
>>and kernel buffers.
>>
>
>Ah. Well, you do have the worst possible architecture for PostgreSQL-SMP
>performance. The ServerWorks chipset is badly flawed (the company is now, I
>believe, bankrupt from recalled products) and Xeons have several performance
>issues on databases based on online tests.
>

Hence my desire for recommendations on alternate architectures ;-)

>AthlonMP appears to be less susceptible to the CS bug than Xeon, and the
>effect of the bug is not as severe. However, a quad-Opteron box can be
>built for less than $6000; what's your standard for "expensive"? If you
>don't have that much money, then you may be stuck for options.
>

Being a 24x7x365 shop, and these servers being mission critical, I
require vendors that can offer 24x7 4-hour part replacement, like Dell
or IBM. I haven't seen 4-way 64-bit boxes meeting that requirement for
less than $20,000, and that's for a very minimally configured box. A
suitably configured pair will likely end up costing $50,000 or more. I
would like to avoid an unexpected expense of that size, unless there's
no other good alternative. That said, I'm all ears for a cheaper
alternative that meets my support and performance requirements.

>Overall, though, I'm not convinced that you have the CS bug and I think it's
>more likely that you have a few "bad queries" which are dragging down the
>whole system. Troubleshoot those and your CPU-bound problems may go away.
>

You may be right, but to compare apples to apples, here's some vmstat
output from a pgbench run:

[billm(at)xxx billm]$ pgbench -i -s 20 pgbench
<snip>
[billm(at)xxx billm]$ pgbench -s 20 -t 500 -c 100 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 20
number of clients: 100
number of transactions per client: 500
number of transactions actually processed: 50000/50000
tps = 369.717832 (including connections establishing)
tps = 370.852058 (excluding connections establishing)

and some of the vmstat output...

[billm(at)poe billm]$ vmstat 1
procs                     memory  swap       io    system         cpu
 r  b swpd   free   buff   cache si so bi    bo  in    cs us sy wa id
 0  1    0 863108 220620 1571924  0  0  4    64  34    50  1  0  0 98
 0  1    0 863092 220620 1571932  0  0  0  3144 171  2037  3  3 47 47
 0  1    0 863084 220620 1571956  0  0  0  5840 202  3702  6  3 46 45
 1  1    0 862656 220620 1572420  0  0  0 12948 631 42093 69 22  5  5
11  0    0 862188 220620 1572828  0  0  0 12644 531 41330 70 23  2  5
 9  0    0 862020 220620 1573076  0  0  0  8396 457 28445 43 17 17 22
 9  0    0 861620 220620 1573556  0  0  0 13564 726 44330 72 22  2  5
 8  1    0 861248 220620 1573980  0  0  0 12564 660 43667 65 26  2  7
 3  1    0 860704 220624 1574236  0  0  0 14588 646 41176 62 25  5  8
 0  1    0 860440 220624 1574476  0  0  0 42184 865 31704 44 23 15 18
 8  0    0 860320 220624 1574628  0  0  0 10796 403 19971 31 10 29 29
 0  1    0 860040 220624 1574884  0  0  0 23588 654 36442 49 20 13 17
 0  1    0 859984 220624 1574932  0  0  0  4940 229  3884  5  3 45 46
 0  1    0 859940 220624 1575004  0  0  0 12140 355 13454 20 10 35 35
 0  1    0 859904 220624 1575044  0  0  0  5044 218  6922 11  5 41 43
 1  1    0 859868 220624 1575052  0  0  0  4808 199  2029  3  3 47 48
 0  1    0 859720 220624 1575180  0  0  0 21596 485 18075 28 13 29 30
11  1    0 859372 220624 1575532  0  0  0 24520 609 41409 62 33  2  3
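
For anyone who wants to put a number on the busy samples rather than eyeball them, here is a rough Python sketch that parses vmstat lines and averages the cs column. It was not part of the original run. The function names, the 40% user-CPU cutoff for "busy", and the four sample rows (copied from the output above, one sample per line) are illustrative choices, and the column order is assumed to match the header shown above.

```python
# Column order assumed from the vmstat header in this message.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy wa id".split()


def parse_vmstat(text):
    """Return one dict per numeric sample line; header lines are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # A sample line has exactly one integer per column.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            samples.append(dict(zip(COLUMNS, map(int, fields))))
    return samples


def busy_cs_stats(samples, min_user_cpu=40):
    """Mean and peak cs/sec over samples with user CPU >= min_user_cpu."""
    busy = [s["cs"] for s in samples if s["us"] >= min_user_cpu]
    if not busy:
        return None
    return {"count": len(busy), "mean": sum(busy) / len(busy), "peak": max(busy)}


# Four rows taken from the vmstat output above (one idle, three busy).
SAMPLE = """\
0 1 0 863092 220620 1571932 0 0 0 3144 171 2037 3 3 47 47
1 1 0 862656 220620 1572420 0 0 0 12948 631 42093 69 22 5 5
11 0 0 862188 220620 1572828 0 0 0 12644 531 41330 70 23 2 5
9 0 0 861620 220620 1573556 0 0 0 13564 726 44330 72 22 2 5
"""

stats = busy_cs_stats(parse_vmstat(SAMPLE))
print(stats)
```

Feeding it the full capture instead of SAMPLE would give the figures for the whole run; the cutoff only serves to exclude the idle seconds before and after the benchmark.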

While pgbench does not generate quite as high a number of CS as our app,
it is an apples-to-apples comparison, and rules out the possibility of
poorly written queries in our app. Still, 40k CS/sec seems high to me.
While pgbench is just a synthetic benchmark, and not necessarily the
best benchmark, yada yada, 370 tps seems like pretty poor performance.
I've benchmarked the IO subsystem at 70MB/s of random 8k writes, yet
pgbench typically doesn't use more than 10MB/s of that bandwidth (a
little more at checkpoints).
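
To make those ratios concrete, here is a back-of-the-envelope calculation in Python using only the figures quoted in this message. The variable names are illustrative, and 40,000 cs/sec is the rounded value read off the vmstat output, not a measured average.

```python
# Figures quoted in this message; cs_per_sec is a rounded, eyeballed value.
tps = 370.0                 # pgbench result, transactions per second
cs_per_sec = 40_000         # context switches per second under load
io_capacity_mb_s = 70       # benchmarked random 8k write bandwidth
io_used_mb_s = 10           # typical bandwidth used during pgbench
total_transactions = 50_000 # 100 clients x 500 transactions each

cs_per_txn = cs_per_sec / tps
io_utilisation = io_used_mb_s / io_capacity_mb_s
run_seconds = total_transactions / tps

print(f"context switches per transaction: {cs_per_txn:.0f}")
print(f"I/O bandwidth in use: {io_utilisation:.0%}")
print(f"approximate run length: {run_seconds:.0f} s")
```

On these numbers that is roughly 108 context switches per transaction while the disks sit at about 14% of their benchmarked throughput.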

So I guess the question is this: now that I've opened up the IO
bottleneck that exists on most database servers, am I really truly CPU
bound now, and not just suffering from poorly handled spinlocks on my
Xeon/ServerWorks platform? If so, is the expense of a 64-bit system
worth it, or is the price/performance for PostgreSQL still better on an
alternative 32-bit platform, like AthlonMP?

Best Regards,

Bill Montgomery
