Re: Temporary WAL segments files not cleaned up after an instance crash

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Temporary WAL segments files not cleaned up after an instance crash
Date: 2018-07-12 06:35:53
Message-ID: 20180712153553.f26563ef.nagata@sraoss.co.jp
Lists: pgsql-hackers

On Mon, 14 May 2018 14:49:55 +0900
Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> Hi all,
>
> While playing with a standby as follows I noticed that xlogtemp.*
> generated in pg_wal may stay around when entering crash recovery. The
> test I was conducting is pretty simple:
> - Use a primary and a standby.
> - Run pgbench on the primary.
> - Then restart the standby with -m immediate and force WAL segment
> switch on the primary in a loop. Depending on the timing, one can see
> that those xlogtemp files stay around. Those files are created when
> a new segment is built from scratch, and their names carry the PID of the
> process creating them. Any previous file existing with the same name is
> unlinked.
>
> The problem is that if an instance is not really stable for one reason or
> another and starts crash recovery periodically, then there is a risk of
> accumulating those temporary files. If pg_wal is on its own partition,
> sized according to max_wal_size, then there is a risk of running into
> ENOSPC and taking PostgreSQL down as new WAL segments cannot be created.
>
> Shouldn't those files be cleaned up at the beginning of crash recovery?
> Attached is a proposal of patch doing so.

I think it makes sense to remove unnecessary temporary WAL files although
I'm not sure how high the risk of ENOSPC is.

The code looks fine, the patch applies to HEAD, and it builds
successfully. I confirmed that all xlogtemp.* files are removed when restarting
a postgres instance that was shut down in immediate mode.

One little thing I noticed is the function name "RemoveXLogTempFiles".
Other similar functions are named RemoveOldXlogFiles or RemoveXlogFile
(using Xlog, not XLog), so it seems more consistent to me to rename this to
"RemoveXlogTempFiles", "RemoveTempXlogFiles", or something similar.

Regards

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
