RE: Speed up the removal of WAL files

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Michael Paquier' <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Speed up the removal of WAL files
Date: 2018-03-07 06:15:24
Message-ID: 0A3221C70F24FB45833433255569204D1F8FE802@G01JPEXMBYT05
Lists: pgsql-hackers

From: Michael Paquier [mailto:michael(at)paquier(dot)xyz]
> On Wed, Mar 07, 2018 at 12:55:43AM +0900, Fujii Masao wrote:
> > So, what about, as another approach, making the checkpointer instead
> > of the startup process call RemoveNonParentXlogFiles() when
> > end-of-recovery checkpoint is executed? ISTM that a recovery doesn't
> > need to wait for RemoveNonParentXlogFiles() to end. Instead,
> > RemoveNonParentXlogFiles() seems to have to complete before the
> > checkpointer calls RemoveOldXlogFiles() and creates .ready files for
> > the "garbage" WAL files on the old timeline.
> > So it seems natural to leave that WAL recycle task to the checkpointer.
>
> Couldn't that impact the I/O performance at the end of recovery until the
> first post-recovery checkpoint is completed? Let's not forget that since
> 9.3 the end-of-recovery checkpoint is not triggered immediately, so there
> could be a delay. If WAL segments of the past timeline are recycled without
> waiting for this first checkpoint to happen then there is no need to create
> new, zero-emptied, segments post-recovery, which can count as well.

Good point. I understand you are referring to PreallocXlogFiles(), which may create one new WAL file if RemoveNonParentXlogFiles() is not called or does not recycle the WAL files on the old timeline.

The best hack (or a compromise/kludge?) seems to be:

1. Modify durable_xx() functions so that they don't fsync directory changes when enableFsync is false.

2. RemoveNonParentXlogFiles() sets enableFsync to false before the while loop, restores its original value after the loop, and then fsyncs pg_wal/ just once.

What do you think?
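
To make the two steps concrete, here is a rough, self-contained C sketch of the control flow. It is not PostgreSQL source: fsync_dir(), durable_rename_sketch(), recycle_segments_sketch(), the sketch_old_N/sketch_recycled_N file names and dir_fsync_count are hypothetical stand-ins, and enableFsync here is a local variable that only mimics the server flag.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Local stand-in for the server's enableFsync flag (assumption: a plain bool). */
static bool enableFsync = true;

/* Counts directory fsync attempts, only to make the effect observable. */
static int dir_fsync_count = 0;

/* Hypothetical helper: flush a directory so renames inside it become durable. */
static int
fsync_dir(const char *dir)
{
    int fd, rc;

    dir_fsync_count++;
    fd = open(dir, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/* Simplified stand-in for a durable_xx() function: rename, then flush the directory. */
static int
durable_rename_sketch(const char *oldpath, const char *newpath, const char *dir)
{
    if (rename(oldpath, newpath) != 0)
        return -1;

    /* Step 1: skip the per-file directory fsync while enableFsync is false. */
    if (enableFsync)
        return fsync_dir(dir);
    return 0;
}

/* Simplified stand-in for the loop in RemoveNonParentXlogFiles(). */
static int
recycle_segments_sketch(const char *dir, int nsegs)
{
    bool save_enableFsync = enableFsync;
    int  failures = 0;
    char oldpath[1024];
    char newpath[1024];

    /* Step 2: turn fsync off for the duration of the loop. */
    enableFsync = false;
    for (int i = 0; i < nsegs; i++)
    {
        snprintf(oldpath, sizeof(oldpath), "%s/sketch_old_%d", dir, i);
        snprintf(newpath, sizeof(newpath), "%s/sketch_recycled_%d", dir, i);
        if (durable_rename_sketch(oldpath, newpath, dir) != 0)
            failures++;
    }
    enableFsync = save_enableFsync;

    /* One directory fsync covers all renames, unless fsync was off to begin with. */
    if (enableFsync && fsync_dir(dir) != 0)
        failures++;

    return failures;
}
```

With N segments to recycle, the sketch issues one directory fsync instead of N, which is the saving the proposal is after.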

Regards
Takayuki Tsunakawa
