Re: High CPU Utilization

From: Joe Uhl <joeuhl(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: High CPU Utilization
Date: 2009-03-20 20:49:00
Message-ID: 468A01CE-F631-4B2F-8882-03C381503DA2@gmail.com
Lists: pgsql-performance


On Mar 20, 2009, at 4:29 PM, Scott Marlowe wrote:

> On Fri, Mar 20, 2009 at 2:26 PM, Joe Uhl <joeuhl(at)gmail(dot)com> wrote:
>> On Mar 17, 2009, at 12:19 AM, Greg Smith wrote:
>>
>>> On Tue, 17 Mar 2009, Gregory Stark wrote:
>>>
>>>> Hm, well the tests I ran for posix_fadvise were actually on a Perc5 -- though
>>>> who knows if it was the same under the hood -- and I saw better performance
>>>> than this. I saw about 4MB/s for a single drive and up to about 35MB/s for 15
>>>> drives. However this was using linux md raid-0, not hardware raid.
>>>
>>> Right, it's the hardware RAID on the Perc5 I think people mainly complain
>>> about. If you use it in JBOD mode and let the higher performance CPU in
>>> your main system drive the RAID functions it's not so bad.
>>>
>>> --
>>> * Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com
>>> Baltimore, MD
>>
>> I have not yet had a chance to try software raid on the standby server
>> (still planning to) but wanted to follow up to see if there was any good way
>> to figure out what the postgresql processes are spending their CPU time on.
>>
>> We are under peak load right now, and I have Zabbix plotting CPU utilization
>> and CPU wait (from vmstat output) along with all sorts of other vitals on
>> charts. CPU utilization is a sustained 90% - 95% and CPU Wait is hanging
>> below 10%. Since being pointed at vmstat by this list I have been watching
>> CPU Wait and it does get high at times (hence still wanting to try Perc5 in
>> JBOD) but then there are sustained periods, right now included, where our
>> CPUs are just getting crushed while wait and IO (only doing about 1.5 MB/sec
>> right now) are very low.
>>
>> This high CPU utilization only occurs when under peak load and when our JDBC
>> pools are fully loaded. We are moving more things into our cache and
>> constantly tuning indexes/tables but just want to see if there is some
>> underlying cause that is killing us.
>>
>> Any recommendations for figuring out what our database is spending its CPU
>> time on?
>
> What does the cs entry on vmstat say at this time? If your cs is
> skyrocketing then you're getting a context switch storm, which is
> usually a sign that there are just too many things going on at once /
> you've got an old kernel, things like that.

The cs column (plus the cpu columns) of vmstat 1 30 reads as follows:

   cs  us  sy  id  wa
11172  95   4   1   0
12498  94   5   1   0
14121  91   7   1   1
11310  90   7   1   1
12918  92   6   1   1
10613  93   6   1   1
 9382  94   4   1   1
14023  89   8   2   1
10138  92   6   1   1
11932  94   4   1   1
15948  93   5   2   1
12919  92   5   3   1
10879  93   4   2   1
14014  94   5   1   1
 9083  92   6   2   0
11178  94   4   2   0
10717  94   5   1   0
 9279  97   2   1   0
12673  94   5   1   0
 8058  82  17   1   1
 8150  94   5   1   1
11334  93   6   0   0
13884  91   8   1   0
10159  92   7   0   0
 9382  96   4   0   0
11450  95   4   1   0
11947  96   3   1   0
 8616  95   4   1   0
10717  95   3   1   0

We are running the 2.6.28.7-2 kernel. I am unfamiliar with vmstat
output, but going by the man page (cs = "context switches per
second") my numbers seem very high.
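
For anyone who would rather summarize numbers like these than eyeball
them, here is a minimal Python sketch. It assumes the input is already
trimmed to a header line plus whitespace-separated integer columns, as
in the excerpt above; the function name summarize and the three-row
sample are only illustrative, and raw vmstat output has more columns
and an extra banner line that would need stripping first.

```python
from statistics import mean

# Illustrative sample: the first three rows of the excerpt above.
SAMPLE = """\
cs us sy id wa
11172 95 4 1 0
12498 94 5 1 0
14121 91 7 1 1
"""


def summarize(text):
    """Return min/mean/max per column for header + integer rows."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    # Transpose the rows into one list of integers per column name.
    columns = {
        name: [int(row[i]) for row in rows]
        for i, name in enumerate(header)
    }
    return {
        name: {"min": min(vals), "mean": mean(vals), "max": max(vals)}
        for name, vals in columns.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(SAMPLE).items():
        print(f"{name}: min={stats['min']} "
              f"mean={stats['mean']:.1f} max={stats['max']}")
```

Pasting in the full 29 rows gives the range and average of cs alongside
the us/sy split, which makes it easier to compare one peak period
against another.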

Our JDBC pools combined currently top out at 400 connections (and we
are doing work on all 400 right now). I may try shrinking those pools
further. Are there any general rules of thumb for figuring out the
maximum number of connections you should service? I know of the
memory constraints, but I am thinking more along the lines of
connections per CPU core.
