Re: .ready and .done files considered harmful

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bossartn(at)amazon(dot)com
Cc: robertmhaas(at)gmail(dot)com, dipesh(dot)pandit(at)gmail(dot)com, jeevan(dot)ladhe(at)enterprisedb(dot)com, sfrost(at)snowman(dot)net, andres(at)anarazel(dot)de, hannuk(at)google(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: .ready and .done files considered harmful
Date: 2021-08-06 01:26:06
Message-ID: 20210806.102606.1567426789036874091.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 3 Aug 2021 20:46:57 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> + /*
> + * Perform a full directory scan to identify the next log segment. There
> + * may be one of the following scenarios which may require us to perform a
> + * full directory scan.
> + *
> + * 1. This is the first cycle since archiver has started and there is no
> + * idea about the next anticipated log segment.
> + *
> + * 2. There is a timeline switch, i.e. the timeline ID tracked at archiver
> + * does not match with current timeline ID. Archive history file as part of
> + * this timeline switch.
> + *
> + * 3. The next anticipated log segment is not available.
>
> One benefit of the current implementation of pgarch_readyXlog() is
> that .ready files created out of order will be prioritized before
> segments with greater LSNs. IIUC, with this patch, as long as there
> is a "next anticipated" segment available, the archiver won't go back
> and archive segments it missed. I don't think the archive status
> files are regularly created out of order, but XLogArchiveCheckDone()
> has handling for that case, and the work to avoid creating .ready
> files too early [0] seems to make it more likely. Perhaps we should
> also force a directory scan when we detect that we are creating a
> .ready file for a segment that is older than the "next anticipated"
> segment.
>
> Nathan
>
> [0] https://postgr.es/m/DA71434B-7340-4984-9B91-F085BC47A778%40amazon.com

The first iteration of pgarch_ArchiveCopyLoop() always works the
current way, because in the last iteration of
pgarch_ArchiveCopyLoop(), pgarch_readyXlog() erases the last
anticipated segment. The shortcut works only when
pgarch_ArchiveCopyLoop() archives more than one successive segment in
a single call. If the anticipated next segment is found to be missing
a .ready file while archiving multiple files, pgarch_readyXlog() falls
back to the regular way (a full directory scan).
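
To make the behavior described above concrete, here is a minimal sketch of
the lookup order. It is not the patch's code: ReadySet, NO_SEGMENT,
has_ready(), full_scan() and next_ready() are hypothetical names, and
segment numbers are plain integers standing in for WAL file names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for archive_status: segments that have a .ready file. */
typedef struct ReadySet
{
    const unsigned *segs;
    size_t          n;
} ReadySet;

/* Assumed sentinel: no anticipated segment, or nothing found. */
#define NO_SEGMENT 0u

/* Does this segment have a .ready file? */
static bool
has_ready(const ReadySet *rs, unsigned seg)
{
    for (size_t i = 0; i < rs->n; i++)
        if (rs->segs[i] == seg)
            return true;
    return false;
}

/* The "regular way": look at every entry and return the oldest segment. */
static unsigned
full_scan(const ReadySet *rs)
{
    unsigned oldest = NO_SEGMENT;

    for (size_t i = 0; i < rs->n; i++)
        if (oldest == NO_SEGMENT || rs->segs[i] < oldest)
            oldest = rs->segs[i];
    return oldest;
}

/*
 * Take the shortcut only when an anticipated segment is known and has a
 * .ready file; otherwise fall back to the full scan.  *used_scan reports
 * which path was taken.
 */
static unsigned
next_ready(const ReadySet *rs, unsigned anticipated, bool *used_scan)
{
    assert(rs != NULL && used_scan != NULL);

    if (anticipated != NO_SEGMENT && has_ready(rs, anticipated))
    {
        *used_scan = false;
        return anticipated;
    }
    *used_scan = true;
    return full_scan(rs);
}
```

In this model, passing NO_SEGMENT corresponds to the first iteration of a
loop, where nothing is anticipated and the oldest segment is found by the
scan. The shortcut applies only on later iterations, and only while the
anticipated segment really has a .ready file.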

So I don't see how the danger you may be considering could happen.

In the first place, .ready files are added while holding WALWriteLock
in XLogWrite(), and while removing old segments after a checkpoint
(which happens during recovery). Assuming that no one manually removes
.ready files on an active server, the former is the sole place that
adds them. So I don't see a chance that .ready files are created out
of order.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
