Re: [9.3 bug] disk space in pg_xlog increases during archive recovery

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: MauMau <maumau307(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [9.3 bug] disk space in pg_xlog increases during archive recovery
Date: 2014-02-02 15:18:07
Message-ID: 20140202151807.GO5930@awork2.anarazel.de

On 2014-02-02 23:50:40 +0900, Fujii Masao wrote:
> On Sun, Feb 2, 2014 at 5:49 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-01-24 22:31:17 +0900, MauMau wrote:
> >> I haven't tried reducing checkpoint_timeout.
> >
> > Did you try reducing checkpoint_segments? As I pointed out, at least if
> > standby_mode is enabled, it will also trigger checkpoints, independently
> > from checkpoint_timeout.
>
> Right. If standby_mode is enabled, checkpoint_segments can trigger
> a restartpoint. But the problem is that the timing of restartpoints
> depends not only on the checkpoint parameters (i.e.,
> checkpoint_timeout and checkpoint_segments) that are used during
> archive recovery but also on the checkpoint WAL records that were
> generated by the master.

Sure. But we really *need* all the WAL since the last checkpoint's redo
location locally to be safe.

> For example, imagine the case where the master generated only one
> checkpoint WAL record since the last backup and then crashed with
> database corruption, and the DBA decided to perform normal archive
> recovery using the last backup. In this case, even if the DBA reduces
> both checkpoint_timeout and checkpoint_segments, only one
> restartpoint can occur during recovery. This low frequency of
> restartpoints might fill up the disk with lots of WAL files.

I am not sure I understand the point of this scenario. If the primary
crashed after a checkpoint, there won't be that much WAL since it
happened...

> > If the issue is that you're not using standby_mode (if so, why?), then
> > maybe the fix is to make that apply to a wider range of situations.
>
> I guess that he is not using standby_mode because, according to
> his first email in this thread, he would like to prevent WAL
> from accumulating in pg_xlog during normal archive recovery (i.e., PITR).

Well, that doesn't necessarily prevent you from using
standby_mode... But yes, that might be the case.

I wonder if we shouldn't just always look at checkpoint_segments during
any recovery that is not crash recovery.
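
To make the trade-off discussed above concrete, here is a toy model in
Python. It is only a sketch and not PostgreSQL code: the function name,
the event names, and the simplification that a checkpoint's redo location
coincides with the checkpoint record itself are all assumptions made for
illustration. It models two points from this thread: WAL can only be
removed up to the last replayed checkpoint record, and a segment-count
trigger only helps if the replayed WAL actually contains checkpoint
records.

```python
def peak_retained_segments(events, checkpoint_segments, segment_trigger):
    """Toy model (hypothetical, not PostgreSQL source).

    events: sequence of "segment" (one WAL segment replayed) or
            "checkpoint" (a checkpoint record from the master replayed).
    segment_trigger: whether replaying checkpoint_segments segments
            requests a restartpoint.
    Returns the peak number of segments kept locally during replay.
    """
    retained = 0       # segments currently kept in pg_xlog
    after_safe = 0     # segments replayed since the last checkpoint record
    since_rp = 0       # segments replayed since the last restartpoint
    have_safe = False  # is there an unused checkpoint record to restart from?
    peak = 0

    for ev in events:
        if ev == "checkpoint":
            have_safe = True
            after_safe = 0
        else:
            retained += 1
            after_safe += 1
            since_rp += 1
            peak = max(peak, retained)

        # A restartpoint needs a replayed checkpoint record; it can only
        # discard WAL that precedes that record.
        if segment_trigger and have_safe and since_rp >= checkpoint_segments:
            retained = after_safe
            since_rp = 0
            have_safe = False

    return peak


# Master checkpointed every 3 segments, 30 segments replayed in total.
regular = (["segment"] * 3 + ["checkpoint"]) * 10
# Master wrote a single checkpoint record, then 50 segments of WAL.
single = ["checkpoint"] + ["segment"] * 50

print(peak_retained_segments(regular, 3, segment_trigger=False))
print(peak_retained_segments(regular, 3, segment_trigger=True))
print(peak_retained_segments(single, 3, segment_trigger=True))
```

In this model the segment trigger bounds local WAL when the master
checkpointed regularly, but cannot help in the single-checkpoint case,
because everything after that checkpoint has to be kept anyway.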

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
