Re: Use durable_unlink for .ready and .done files for WAL segment removal

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: bossartn(at)amazon(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Use durable_unlink for .ready and .done files for WAL segment removal
Date: 2018-11-22 04:16:09
Message-ID: 20181122041609.GG3369@paquier.xyz
Lists: pgsql-hackers

On Thu, Nov 15, 2018 at 07:39:27PM +0900, Kyotaro HORIGUCHI wrote:
> At Fri, 02 Nov 2018 14:47:08 +0000, Nathan Bossart
> <bossartn(at)amazon(dot)com> wrote in
> <154117002849(dot)5569(dot)14588306221618961668(dot)pgcf(at)coridan(dot)postgresql(dot)org>:
>> One argument for instead checking WAL file existence before calling
>> archive_command might be to avoid the increased startup time.

I guess that you mean the startup of the archive command itself here.
Yes, that can be an issue with a high WAL output, depending on the
interpreter of the archive command :(

>> Granted, any added delay from this patch is unlikely to be noticeable
>> unless your archiver is way behind and archive_status has a huge
>> number of files. However, I have seen cases where startup is stuck on
>> other tasks like SyncDataDirectory() and RemovePgTempFiles() for a
>> very long time, so perhaps it is worth considering.

What's the scale of the pg_wal partition and the amount of time things
were stuck? I would imagine that the sync phase hurts the most, and a
fast startup time for crash recovery is always important.

> While archive_mode is turned on, .ready files are created for all
> existing WAL files if they do not exist. Thus the archiver may wait
> for the earliest segment to have a .ready file.

Yes, RemoveOldXlogFiles() does that via XLogArchiveCheckDone().

> As a result,
> pgarch_readyXLog can be modified to loop over WAL files, not
> status files. This prevents the confusion that comes from .ready
> files for non-existent segment files.

No, pgarch_readyXLog() should still look for .ready files as those are
here for this purpose, but we could have an additional check to see if
the segment linked with it actually exists and can be archived. This
check could happen in pgarch.c before the archive command gets called
(just before pgarch_ArchiverCopyLoop() and after XLogArchiveCommandSet()
feels about right, and it should be cheap enough to call stat()).
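
To illustrate the kind of check described above, here is a minimal
standalone sketch. It is not PostgreSQL code: the function name
segment_exists_for_ready_file() and its parameters are hypothetical, and a
real version in pgarch.c would build the path with the backend's own macros
and error reporting. It only shows that one stat() call per .ready file is
enough to tell whether the matching segment is present as a regular file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of PostgreSQL): return true if the WAL
 * segment named by a .ready file exists as a regular file in wal_dir.
 * Any stat() failure, including ENOENT, is treated as "not archivable".
 */
static bool
segment_exists_for_ready_file(const char *wal_dir, const char *segment_name)
{
    char        path[1024];
    struct stat st;
    int         len;

    assert(wal_dir != NULL && segment_name != NULL);

    /* Build "<wal_dir>/<segment_name>", refusing truncated paths. */
    len = snprintf(path, sizeof(path), "%s/%s", wal_dir, segment_name);
    if (len < 0 || (size_t) len >= sizeof(path))
        return false;

    /* A single stat() call: cheap compared to spawning archive_command. */
    if (stat(path, &st) != 0)
        return false;

    /* Directories and other non-regular files cannot be archived. */
    return S_ISREG(st.st_mode);
}
```

A caller would run this for each .ready file found and skip the archive
command when it returns false. What to do with the orphaned .ready file
in that case is a separate decision that this sketch does not cover.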
--
Michael
