Re: Redesigning checkpoint_segments

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Venkata Balaji N <nag1010(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redesigning checkpoint_segments
Date: 2015-05-21 15:40:40
Message-ID: CAHGQGwFcGe86zWvHbNe6yzJAWj+AKnsWu8ZvQXxnChG0pQkwnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 21, 2015 at 3:53 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Mon, Mar 16, 2015 at 11:05 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>
>> On Mon, Feb 23, 2015 at 8:56 AM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com> wrote:
>>>
>>>
>>> Everyone seems to be happy with the names and behaviour of the GUCs, so
>>> committed.
>>
>>
>>
>> The docs suggest that max_wal_size will be respected during archive
>> recovery (causing restartpoints and recycling), but I'm not seeing that
>> happening. Is this a doc bug or an implementation bug?
>
>
> I think the old behavior, where restartpoints were driven only by time and
> not by volume, was a misfeature. But not a bug, because it was documented.
>
> One of the points of max_wal_size and its predecessor is to limit how big
> pg_xlog can grow. But running out of disk space on pg_xlog is no more fun
> during archive recovery than it is during normal operations. So why
> shouldn't max_wal_size be active during recovery?

The following message of commit 7181530 explains why.

In standby mode, respect checkpoint_segments in addition to
checkpoint_timeout to trigger restartpoints. We used to deliberately only
do time-based restartpoints, because if checkpoint_segments is small we
would spend time doing restartpoints more often than really necessary.
But now that restartpoints are done in bgwriter, they're not as
disruptive as they used to be. Secondly, because streaming replication
stores the streamed WAL files in pg_xlog, we want to clean it up more
often to avoid running out of disk space when checkpoint_timeout is large
and checkpoint_segments small.

Previously users were more likely to fall into this trouble (i.e., too frequent
occurrence of restartpoints) because the default value of checkpoint_segments
was very small, I guess. But we increased the default of max_wal_size, so now
the risk of that trouble seems to be smaller than before, and maybe we can
allow max_wal_size to trigger restartpoints.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-05-21 15:42:39 Re: GROUPING
Previous Message Andrew Gierth 2015-05-21 15:19:27 Re: GROUPING