Re: Checkpoint spikes

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Richard Neill <rn214(at)cam(dot)ac(dot)uk>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Checkpoint spikes
Date: 2009-12-09 02:05:40
Message-ID: 4B1F05F4.6020305@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Richard Neill wrote:
> (does the advice for 8.3 apply unchanged to 8.4?)
Yes; no changes in this area for 8.4. The main things performance
related that changed between 8.3 and 8.4 are:
1) VACUUM free space management reimplemented so that the max_fsm_*
parameters aren't needed anymore
2) default_statistics_target now starts at 100 instead of 10

> So far, I've set checkpoint_segments to 128, timeout to 10min, and
> completion_target to 0.8. This helps, but not as much as I'd hoped.
Good, if the problem is moving in the right direction you're making
progress.

> But I haven't touched any of the other WAL or BG Writer settings.
> Where should I look next?
> Should I be looking at the BG Writer settings,
> or should I look at the Linux VM configuration?
> (eg changing /proc/sys/vm/dirty_background_ratio from 5 to 1)
I would start by reducing dirty_background_ratio; as RAM sizes climb,
this keeps becoming a bigger issue. The whole disk flushing code
finally got a major overhaul in the 2.6.32 Linux kernel, I'm hoping this
whole class of problem was improved from the changes made.

Changes to the background writer behavior will probably not work as
you'd expect. The first thing I'd try it in your situation turning it
off altogether; it can be slightly counterproductive for reducing
checkpoint issues if they're really bad, which yours are. If that goes
in the wrong direction, experimenting with increasing the maximum pages
and the multiplier might be useful, I wouldn't bet on it helping through.

As Kevin already mentioned, reducing the size of the buffer cache can
help too. That's worth trying if you're exhausted the other obvious
possibilities.

> Or would it be most useful to try to move the WAL to a different disk?
On Linux having the WAL on a separate disk can improve things much more
than you might expect, simply because of how brain-dead the filesystem
fsync implementation is. Reducing the seeks for WAL traffic can help a
lot too.

If you've lowered Linux's caching, tried some BGW tweaks, and moved the
WAL to somewhere else, if latency is still high you may be facing a
hardware upgrade to improve things. Sometimes these problems just
require more burst write throughput (regardless of how good average
performance looks) and nothing else will substitute. Hopefully you'll
find a tuning solution before that though.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andres Freund 2009-12-09 02:13:06 Re: Checkpoint spikes
Previous Message Robert Haas 2009-12-08 18:30:32 Re: Optimizing Bitmap Heap Scan.