Skip site navigation (1) Skip section navigation (2)

Re: FW: Queries becoming slow under heavy load

From: Shaun Thomas <sthomas(at)peak6(dot)com>
To: Anne Rosset <arosset(at)collab(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: FW: Queries becoming slow under heavy load
Date: 2011-01-26 17:16:38
Message-ID: 4D4056F6.3010807@peak6.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 01/26/2011 10:04 AM, Anne Rosset wrote:

> We've been able to match  long running database queries to such
> processes. This occurs under relatively low load average (say 4 out of
> 8) and can involve as little as 1 single sql query.

The b state means the process is blocking, waiting for... something. One 
thing you need to consider is far more than your CPU usage. If you have 
the 'sar' utility, run it as 'sar 1 100' just to see how your system is 
working. What you want to watch for is iowait.

If even one query is churning your disks, every single other query that 
has to take even one block from disk instead of cache, is going to 
stall. If you see an iowait of anything greater than 5%, you'll want to 
check further on the device that contains your database with iostat. My 
favorite use of this is 'iostat -dmx [device] 1' where [device] is the 
block device where your data files are, if your WAL is somewhere else.

And yeah, your shared_buffers are kinda on the lowish side. Your 
effective_cache_size is good, but you have a lot more room to increase 
PG-specific memory.

Worse however, is your checkpoints. Lord. Increase checkpoint_segments 
to *at least* 20, and increase your checkpoint_completion_target to 0.7 
or 0.8. Check your logs for checkpoint warnings, and I'll bet it's 
constantly complaining about increasing your checkpoint segments. Every 
checkpoint not started by the scheduled system risks a checkpoint spike, 
which can flood your system with IO regardless of which queries are 
running. That kind of IO storm will ruin your performance, and with only 
3 checkpoint segments on a busy database, are probably happening constantly.

Unfortunately we still need to know more. This is just based on your PG 
settings, and that's not really enough to know how "busy" your DB is. 
One way to check is to log the contents of pg_stat_database, especially 
the xact_commit and xact_rollback columns. Grab those with a timestamp. 
If you get a snapshot of that every minute, you can figure out how many 
queries you're processing per minute or per second pretty easily. We've 
hit 8600 TPS before and don't have nearly the trouble you've been reporting.

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas(at)peak6(dot)com

______________________________________________

See  http://www.peak6.com/email_disclaimer.php
for terms and conditions related to this email

In response to

Responses

pgsql-performance by date

Next:From: David GrecoDate: 2011-01-26 19:31:58
Subject: Real vs Int performance
Previous:From: Anne RossetDate: 2011-01-26 16:04:41
Subject: FW: Queries becoming slow under heavy load

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group