Re: Speed up the removal of WAL files

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Speed up the removal of WAL files
Date: 2017-11-17 08:03:53
Message-ID: 20171117.170353.156223714.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Fri, 17 Nov 2017 06:35:41 +0000, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote in <0A3221C70F24FB45833433255569204D1F81B0C8(at)G01JPEXMBYT05>
> Hello,
>
> The attached patch speeds up the removal of WAL files in the old timelines. I'll add this to the next CF.
>
>
> BACKGROUND
> ==================================================
>
> We need to meet a severe availability requirement of a potential customer. They will use synchronous streaming replication. The allowed failover duration, from the failure through failure detection to the failover completion, is 10 seconds. Even one second is precious.
>
> During testing on a fast machine with SSD, we observed about 2 seconds between these messages. There were no other messages between them.
>
> LOG: archive recovery complete
> LOG: MultiXact member wraparound protections are now enabled
>
>
> CAUSE
> ==================================================
>
> Examining the source code, RemoveNonParentXlogFiles() seems to account for the time. It syncs the pg_wal directory every time it deletes a WAL file. max_wal_size was set to 48GB, so about 1,000 WAL files were probably deleted and hence the pg_wal directory was synced that many times.
>
>
> FIX
> ==================================================
>
> unlink() the WAL files, then sync the pg_wal directory once at the end.
>
> Unfortunately, the original machine is now not available, so I confirmed the speedup on a VM with HDD.
>
> [time to remove 1,000 WAL files including the directory sync]
> nonpatched: 2.45 seconds
> patched: 0.81 seconds
>
>
> Regards
> Takayuki Tsunakawa

The original code recycles some of the to-be-removed files, but the
patch removes all of them. This may affect performance.

Likewise, the original code uses durable_unlink to actually
remove a file, so separating unlink and fsync might resurrect the
problem that should have been fixed by
1b02be21f271db6bd3cd43abb23fa596fcb6bac3 (I'm not sure what it
was, but you are one of the reviewers of it). I suppose that you
need to explain why this change doesn't risk anything.
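
To make the difference concrete, here is a minimal POSIX sketch of
the two strategies. It is not the PostgreSQL code: sync_dir,
remove_each_synced, remove_then_sync_once and create_empty_files
are hypothetical names made up for this illustration, and the first
function only approximates the "unlink, then fsync the parent
directory" pattern per file. Each function returns the number of
directory syncs it issued, or -1 on error.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: fsync a directory so entry removals become durable. */
static int
sync_dir(const char *dir)
{
    int fd = open(dir, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/* Hypothetical helper: create n empty files under dir (used for testing). */
static int
create_empty_files(const char *dir, const char *const *names, int n)
{
    char path[1024];

    for (int i = 0; i < n; i++)
    {
        int fd;

        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        fd = open(path, O_CREAT | O_WRONLY, 0600);
        if (fd < 0)
            return -1;
        close(fd);
    }
    return 0;
}

/* Per-file strategy: each removal is made durable before the next one. */
static int
remove_each_synced(const char *dir, const char *const *names, int n)
{
    char path[1024];
    int syncs = 0;

    for (int i = 0; i < n; i++)
    {
        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        if (unlink(path) != 0 || sync_dir(dir) != 0)
            return -1;
        syncs++;
    }
    return syncs;
}

/* Batched strategy: unlink everything, then sync the directory once. */
static int
remove_then_sync_once(const char *dir, const char *const *names, int n)
{
    char path[1024];

    for (int i = 0; i < n; i++)
    {
        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        if (unlink(path) != 0)
            return -1;
    }
    /* Until this returns, none of the removals above is guaranteed durable. */
    return sync_dir(dir) == 0 ? 1 : -1;
}
```

With n files the first variant issues n directory syncs and the
second issues one, which is where the speedup comes from. The cost
is that a crash before the final sync may leave any subset of the
files still present, so the caller has to tolerate that.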

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
