Re: Server hitting 100% CPU usage, system comes to a crawl.

From: Brian Fehrle <brianf(at)consistentstate(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Server hitting 100% CPU usage, system comes to a crawl.
Date: 2011-10-27 21:22:33
Message-ID: 4EA9CB99.8090808@consistentstate.com
Lists: pgsql-general
On 10/27/2011 02:50 PM, Tom Lane wrote:
> Brian Fehrle <brianf(at)consistentstate(dot)com> writes:
>> Hi all, need some help/clues on tracking down a performance issue.
>> PostgreSQL version: 8.3.11
>> I've got a system that has 32 cores and 128 gigs of ram. We have
>> connection pooling set up, with about 100 - 200 persistent connections
>> open to the database. Our applications then use these connections to
>> query the database constantly, but when a connection isn't currently
>> executing a query, it's <IDLE>. On average, at any given time, there are
>> 3 - 6 connections that are actually executing a query, while the rest
>> are <IDLE>.
>> About once a day, queries that normally take just a few seconds slow way
>> down, and start to pile up, to the point where instead of just having
>> 3-6 queries running at any given time, we get 100 - 200. The whole
>> system comes to a crawl, and looking at top, the CPU usage is 99%.
> This is jumping to a conclusion based on insufficient data, but what you
> describe sounds a bit like the sinval queue contention problems that we
> fixed in 8.4.  Some prior reports of that:
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00001.php
> http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php
>
> If your symptoms match those, the best fix would be to update to 8.4.x
> or later, but a stopgap solution would be to cut down on the number of
> idle backends.
>
> 			regards, tom lane
That sounds fairly close to the issue I am seeing. The main differences
are that my spike lasts much longer than a few minutes and is only
resolved when the cluster is restarted. Also, the top output in that
second link shows most of the CPU time under 'user', rather than under
'sys' as in mine.

Is there anything more I can look at to get more information on this
'sinval queue contention problem'?

Also, my CPU usage is high in 'sys' rather than 'us'. Could that be a
red flag, or is that normal?
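
For anyone who wants to put numbers on the 'us' versus 'sy' split outside
of top, here is a minimal sketch. It assumes a Linux host, where the first
line of /proc/stat holds cumulative CPU counters in the order user, nice,
system, idle, and so on. The function names (parse_cpu_line, cpu_split,
read_cpu_line) are made up for illustration and are not part of any
PostgreSQL or system tool.

```python
# Sketch only: compare two samples of the aggregate "cpu" line from
# /proc/stat (Linux) and report how the elapsed CPU time was divided.
# Field order on that line: user nice system idle iowait irq softirq steal ...

def parse_cpu_line(line):
    """Return the counters of an aggregate 'cpu' line as a list of ints."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(x) for x in parts[1:]]


def cpu_split(before, after):
    """Percent of elapsed CPU time in user ('us'), system ('sy'), idle ('id')."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = [y - x for x, y in zip(a, b)]
    # Sum through 'steal'; guest time is already included in user/nice.
    total = sum(delta[:8])
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {
        "us": 100.0 * delta[0] / total,  # user-space time
        "sy": 100.0 * delta[2] / total,  # kernel (system) time
        "id": 100.0 * delta[3] / total,  # idle time
    }


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()
```

Calling read_cpu_line() twice a few seconds apart during a spike and
passing both results to cpu_split() gives the split over that interval.
The three values need not add up to 100, since nice, iowait, irq, softirq
and steal time are counted in the total but not reported separately.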

- Brian F
