Re: Maximum number of WAL files in the pg_xlog directory

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Maximum number of WAL files in the pg_xlog directory
Date: 2014-11-03 20:39:26
Message-ID: CAMkU=1xmwNE2CZ2Yg8aYb71j9sLwa-JPFpnaPXNxwJoeVOfnxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 15, 2014 at 1:11 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge <guillaume(at)lelarge(dot)info
> > wrote:
>
>> Hi,
>>
>> As part of our monitoring work for our customers, we stumbled upon an
>> issue with our customers' servers who have a wal_keep_segments setting
>> higher than 0.
>>
>> We have a monitoring script that checks the number of WAL files in the
>> pg_xlog directory, according to the setting of three parameters
>> (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
>> We usually add a percentage to the usual formula:
>>
>> greatest(
>> (2 + checkpoint_completion_target) * checkpoint_segments + 1,
>> checkpoint_segments + wal_keep_segments + 1
>> )
>>
>
> I think the first bug is even having this formula in the documentation to
> start with, and in trying to use it.
>
> "and will normally not be more than..."
>
> This may be "normal" for a toy system. I think that the normal state for
> any system worth monitoring is that it has had load spikes at some point in
> the past.
>
> So it is the next part of the doc, which describes how many segments it
> climbs back down to upon recovering from a spike, which is the important
> one. And that doesn't mention wal_keep_segments at all, which surely
> cannot be correct.
>
> I will try to independently derive the correct formula from the code, as
> you did, without looking too much at your derivation first, and see if we
> get the same answer.
>

It looked to me that the formula, when descending from a previously
stressed state, would be:

greatest(1 + checkpoint_completion_target) * checkpoint_segments,
wal_keep_segments) + 1 +
2 * checkpoint_segments + 1

This assumes logs are filled evenly over a checkpoint cycle, which is
probably not true because there is a spike in full page writes right after
a checkpoint starts.

But I didn't have a great deal of confidence in my analysis.

The first line reflects the number of WAL that will be retained as-is, the
second is the number that will be recycled for future use before starting
to delete them.

My reading of the code is that wal_keep_segments is computed from the
current end of WAL (i.e the checkpoint record), not from the checkpoint
redo point. If I distribute the part outside the 'greatest' into both
branches of the 'greatest', I don't get the same answer as you do for
either branch.

Then I started wondering if the number we keep for recycling is a good
choice, anyway. 2 * checkpoint_segments + 1 seems pretty large. But then
again, given that we've reached the high-water-mark once, how unlikely are
we to reach it again?

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Abhijit Menon-Sen 2014-11-03 20:40:55 Re: pgaudit - an auditing extension for PostgreSQL
Previous Message Alvaro Herrera 2014-11-03 20:36:48 Re: tracking commit timestamps