Re: WAL partition filling up after high WAL activity

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: WAL partition filling up after high WAL activity
Date: 2011-11-09 16:06:46
Message-ID: 4EBAA516.2070307@2ndQuadrant.com
Lists: pgsql-performance

On 11/07/2011 05:18 PM, Richard Yen wrote:
> My biggest question is: we know from the docs that there should be no
> more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
> files. For us, that would mean no more than 48 files, which equates
> to 384MB--far lower than the 9.7GB partition size. **Why would WAL
> use up so much disk space?**
>

That's only true if things are operating normally. There are at least
two ways this can fail to be a proper upper limit on space used:

1) You are archiving to a second system, and the archiving isn't keeping
up. Things that haven't been archived can't be re-used, so more disk
space is used.

2) Disk I/O is slow, and the checkpoint writes take a significant period
of time. The internal scheduling assumes each individual write will
happen without too much delay. That assumption can easily be untrue on
a busy system. The worst I've seen so far are checkpoints that take 6
hours to sync, where the time is supposed to be a few seconds. WAL disk
usage in that case was a giant multiple of what checkpoint_segments
would suggest. (The source of that problem is very much improved in
PostgreSQL 9.1.)
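
A quick way to test for case 1 is to count the segments still waiting on
the archiver. This is a minimal sketch, assuming archiving is enabled and
that the status markers live in pg_xlog/archive_status under the data
directory, as they do in releases of this era. The function name and the
path you pass in are illustrative, not something from this thread.

```python
import os


def count_unarchived_segments(archive_status_dir):
    """Count WAL segments that are still waiting to be archived.

    PostgreSQL keeps a '<segment>.ready' marker in archive_status for each
    segment that has not been archived yet, and renames it to
    '<segment>.done' once archive_command succeeds.
    """
    return sum(
        1
        for name in os.listdir(archive_status_dir)
        if name.endswith(".ready")
    )


# Hypothetical usage (the path is an assumption about your layout):
# count_unarchived_segments("/var/lib/pgsql/data/pg_xlog/archive_status")
```

A count that keeps growing would point at archiving falling behind; a
count near zero makes slow checkpoint syncs the more likely explanation.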

The info needed to figure out which category you're in would appear
after turning log_checkpoints on in postgresql.conf; you only need
to reload the server config after that, it doesn't require a restart. I
would guess you have really long sync times there.

As for what to do about it, checkpoint_segments=16 is a low setting.
You might as well set it to a large number, say 128, and let checkpoints
get driven by time instead. The existing limit isn't working
effectively anyway, and having more segments lets the checkpoint
spreading code work more evenly.
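
For a sense of scale, the formula from the docs can be worked through for
both settings. This is only a sketch: it assumes the default 16MB segment
size and checkpoint_completion_target = 0.9 (picked because it reproduces
the 48-file figure quoted above; the thread states neither value), and it
rounds up to whole files.

```python
import math

SEGMENT_MB = 16  # assumption: default WAL segment size


def nominal_wal_ceiling(checkpoint_segments, completion_target=0.9):
    """Return (files, megabytes) for the documented normal-case ceiling:
    (2 + checkpoint_completion_target) * checkpoint_segments + 1
    """
    files = math.ceil((2 + completion_target) * checkpoint_segments + 1)
    return files, files * SEGMENT_MB


for segments in (16, 128):
    files, mb = nominal_wal_ceiling(segments)
    print(f"checkpoint_segments={segments}: up to {files} files, about {mb} MB")
```

Under those assumptions, 48 files at 16MB each is 768MB rather than
384MB, and checkpoint_segments=128 gives a nominal 373 files, or a bit
under 6GB. That is still below the 9.7GB partition, but the number is
only a normal-case figure and not a hard limit.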

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
