
High CPU Utilization

From: Joe Uhl <joeuhl@gmail.com>
To: pgsql-performance@postgresql.org
Subject: High CPU Utilization
Date: 2009-03-16 18:48:48
Message-ID: 8609D1C0-3ECD-4A4F-B9C6-1FC8496EDC7A@gmail.com
Lists: pgsql-performance
Our production database is seeing very heavy CPU utilization - anyone  
have any ideas/input considering the following?

CPU utilization gradually increases during the day until it approaches  
90%-100% at our peak time.  When this happens our transactions/sec  
drops and our site becomes very slow.  When in this state, I can see  
hundreds of queries in pg_stat_activity that are not waiting on locks  
but sit there for minutes.  When the database is not in this state,  
those same queries can complete in fractions of a second - faster than
my script that watches pg_stat_activity can keep track of them.
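
The check I'm running looks roughly like this (column names as in
8.3's pg_stat_activity; run via psql against the production database):

```sql
-- Active, non-waiting queries, longest-running first
SELECT procpid,
       now() - query_start AS runtime,
       waiting,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
ORDER BY runtime DESC
LIMIT 20;
```

During the bad periods "waiting" is false for these rows, yet runtime
climbs into the minutes.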

This server has dual quad core xeon 5310s, 32 GB RAM, and a few  
different disk arrays (all managed in hardware by either the Perc5/i  
or Perc5/e adapter).  The Postgres data is on a 14 disk 7.2k SATA raid  
10.  This server runs nothing but Postgres.

The PostgreSQL database (according to pg_database_size) is 55GB and we  
are running PostgreSQL 8.3.5 and the 2.6.28.7-2 kernel under Arch Linux.

Right now (not under peak load) this server is running at 68% CPU  
utilization and its SATA raid 10 is doing about 2MB/s writes and
11MB/s reads.  When I run dd I can hit 200+MB/s writes and 230+MB/s reads,
so we are barely using the available IO.  Further when I run dd the  
CPU utilization of that process only approaches 20%-30% of one core.
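
For reference, the dd runs I'm quoting are along these lines (file
path and size here are illustrative, and the read pass can be served
from the page cache unless the file is larger than RAM):

```shell
# Sequential write of ~100 MB in 8 kB blocks (Postgres's page size),
# fdatasync'd at the end so the page cache doesn't inflate the number.
dd if=/dev/zero of=/tmp/ddtest bs=8k count=12800 conv=fdatasync 2>&1 | tail -n 1

# Sequential read of the same file (may be cached - use a file larger
# than RAM for a true disk read test).
dd if=/tmp/ddtest of=/dev/null bs=8k 2>&1 | tail -n 1

# Clean up the test file.
rm -f /tmp/ddtest
```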

Additionally, when I view "top -c" I generally see a dozen or so  
"idle" postgres processes (they appear and drop away quickly though)  
consuming very large chunks of CPU (as much as 60% of a core).  At any  
given time we have hundreds of idle postgres processes due to the JDBC  
connection pooling but most of them are 0% as I would expect them to  
be.  I also see selects and inserts consuming very large percentages  
of CPU but I view that as less strange since they are doing work.

Any ideas as to what is causing our CPUs to struggle?  Is the fact  
that our RAM covers a significant portion of the database causing our  
CPUs to do a bunch of thrashing as they work with memory while our  
disk controllers sit idle?  According to top we barely use any swap.

We currently have max_connections set to 1000 (roughly the sum of the  
JDBC pools on our application servers).  Would decreasing this value  
help?  We can decrease the JDBC pools or switch to pgbouncer for  
pooling if this is the case.
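
If pooling does turn out to be the issue, I'm picturing a minimal
pgbouncer.ini along these lines (host, database name, and pool sizes
are placeholders, not our actual values):

```ini
[databases]
; placeholder connection string - point at the real server
socialdb = host=127.0.0.1 port=5432 dbname=socialdb

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = trust
; transaction pooling lets many client connections share few backends
pool_mode = transaction
; far fewer server connections than our current max_connections = 1000
default_pool_size = 50
max_client_conn = 1000
```

One caveat I'm aware of: transaction pooling discards session state
between transactions, which can matter for JDBC prepared statements.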

Really just looking for any input/ideas.  Our workload is primarily  
OLTP in nature - essentially a social network.  By transactions/sec at
the start I mean the xact_commit value in pg_stat_database.
Please let me know if this value is not appropriate for getting a tps  
guess.  Right now with the 60% CPU utilization and low IO use  
xact_commit is increasing at a rate of 1070 a second.
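
The arithmetic behind that figure is just a delta over the sampling
interval; a sketch with made-up readings (the real samples come from
SELECT xact_commit FROM pg_stat_database WHERE datname = current_database()):

```shell
# tps estimate from two xact_commit samples taken N seconds apart.
# $1 = first reading, $2 = second reading, $3 = seconds elapsed
tps() {
  echo $(( ($2 - $1) / $3 ))
}

# Example with hypothetical readings one minute apart:
tps 1000000 1064200 60   # prints 1070
```

Note this counts commits only; rollbacks are tracked separately in
xact_rollback.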

I have an identical PITR slave I can pause the PITR sync on to run any  
test against.  I will happily provide any additional information that  
would be helpful.

Any assistance is greatly appreciated.

Joe Uhl
