Skip site navigation (1) Skip section navigation (2)

Re: Checkpoint spikes

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Richard Neill <rn214(at)cam(dot)ac(dot)uk>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Checkpoint spikes
Date: 2009-12-09 02:05:40
Message-ID: 4B1F05F4.6020305@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-performance
Richard Neill wrote:
> (does the advice for 8.3 apply unchanged to 8.4?)
Yes; no changes in this area for 8.4.  The main things performance 
related that changed between 8.3 and 8.4 are:
1) VACUUM free space management reimplemented so that the max_fsm_* 
parameters aren't needed anymore
2) default_statistics_target now starts at 100 instead of 10

> So far, I've set checkpoint_segments to 128, timeout to 10min, and
> completion_target to 0.8. This helps, but not as much as I'd hoped.
Good, if the problem is moving in the right direction you're making 
progress.

> But I haven't touched any of the other WAL or BG Writer settings.
> Where should I look next?
>   Should I be looking at the BG Writer settings,
>   or should I look at the Linux VM configuration?
>     (eg changing /proc/sys/vm/dirty_background_ratio from 5 to 1)
I would start by reducing dirty_background_ratio; as RAM sizes climb, 
this keeps becoming a bigger issue.  The whole disk flushing code 
finally got a major overhaul in the 2.6.32 Linux kernel, I'm hoping this 
whole class of problem was improved from the changes made.

Changes to the background writer behavior will probably not work as 
you'd expect.  The first thing I'd try it in your situation turning it 
off altogether; it can be slightly counterproductive for reducing 
checkpoint issues if they're really bad, which yours are.  If that goes 
in the wrong direction, experimenting with increasing the maximum pages 
and the multiplier might be useful, I wouldn't bet on it helping through.

As Kevin already mentioned, reducing the size of the buffer cache can 
help too.  That's worth trying if you're exhausted the other obvious 
possibilities.

>   Or would it be most useful to try to move the WAL to a different disk?
On Linux having the WAL on a separate disk can improve things much more 
than you might expect, simply because of how brain-dead the filesystem 
fsync implementation is.  Reducing the seeks for WAL traffic can help a 
lot too.

If you've lowered Linux's caching, tried some BGW tweaks, and moved the 
WAL to somewhere else, if latency is still high you may be facing a 
hardware upgrade to improve things.  Sometimes these problems just 
require more burst write throughput (regardless of how good average 
performance looks) and nothing else will substitute.  Hopefully you'll 
find a tuning solution before that though.


-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com  www.2ndQuadrant.com


In response to

Responses

pgsql-performance by date

Next:From: Andres FreundDate: 2009-12-09 02:13:06
Subject: Re: Checkpoint spikes
Previous:From: Robert HaasDate: 2009-12-08 18:30:32
Subject: Re: Optimizing Bitmap Heap Scan.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group