Re: archive status ".ready" files may be created too early

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bossartn(at)amazon(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: archive status ".ready" files may be created too early
Date: 2019-12-17 10:25:12
Message-ID: 20191217.192512.1598132387026445578.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Uggg. I must apologize for my last bogus comment.

At Fri, 13 Dec 2019 21:24:36 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> On 12/12/19, 8:08 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > As the result the patch doesn't seem to save anything than setting up
> > and operating correctly.
>
> Disregarding the behavior of standby servers for a minute, I think

I'm sorry. A continuation record split across a segment boundary
doesn't seem to harm replication. Please forget it.

> that what I've described is still a problem for archiving. If the

Yeah, I think that can happen, and it does seem to be a problem.

> segment is archived too early, point-in-time restores that require it
> will fail. If the server refuses to overwrite existing archive files,
> the archiver process may fail to process the "good" version of the
> segment until someone takes action to fix it. I think this is
> especially troubling for backup utilities like pgBackRest that check
> the archive_status directory independently since it is difficult to
> know if the segment is truly ".ready".
>
> I've attached a slightly improved patch to show how this might be
> fixed. I am curious what concerns there are about doing something
> like it to prevent this scenario.

Basically, I agree with the direction, in which the .ready notification
is delayed until all requested WAL bytes are written out.

But I think I found a corner case where the patch doesn't work. As I
mentioned in another message, if the WAL buffers are full,
AdvanceXLInsertBuffer calls XLogWrite to write out the victim buffer
regardless of whether the last record in the page is the first half of
a continuation record. XLogWrite can then mark the segment as .ready
even with the patch.
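
To make the corner case concrete, here is a toy model of it. This is
not PostgreSQL source: the 1024-byte segment size and every toy_* name
are made up for this sketch, and positions are plain byte offsets
rather than real LSNs. It only captures the condition "the write
reached the end of the segment, but the record that crosses that
boundary is not fully written yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; the segment size and all names are hypothetical. */
#define TOY_SEG_SIZE 1024u

/* True if a write up to byte offset write_upto completes segment segno. */
static bool
toy_segment_finished(uint64_t segno, uint64_t write_upto)
{
    return write_upto >= (segno + 1) * TOY_SEG_SIZE;
}

/* True if the record [rec_start, rec_end) crosses the end of segment segno. */
static bool
toy_record_crosses_segment_end(uint64_t segno,
                               uint64_t rec_start, uint64_t rec_end)
{
    uint64_t seg_end = (segno + 1) * TOY_SEG_SIZE;

    assert(rec_start < rec_end);
    return rec_start < seg_end && rec_end > seg_end;
}

/*
 * True if marking segno as .ready at this point would be too early:
 * the segment is physically complete, but the record that crosses its
 * end has not been written out in full.
 */
static bool
toy_ready_too_early(uint64_t segno, uint64_t write_upto,
                    uint64_t rec_start, uint64_t rec_end)
{
    return toy_segment_finished(segno, write_upto)
        && toy_record_crosses_segment_end(segno, rec_start, rec_end)
        && write_upto < rec_end;
}
```

For example, with a record occupying bytes 1000..1099, a victim-buffer
write that stops exactly at offset 1024 makes
toy_ready_too_early(0, 1024, 1000, 1100) true, while a write up to 1100
does not.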

Is that correct? And do you think the corner case is worth amending?

If so, we could also cover that case by marking the previous segment as
.ready when XLogWrite writes the first bytes of the next segment. (As
a further corner case, it still doesn't work if a continuation record
spans three or more segments. But I think (or want to think) that
we don't need to consider that case.)
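
A rough sketch of that rule, again as a toy model rather than
PostgreSQL code (the 1024-byte segment size and all toy_* names are
assumptions of the sketch, and offsets are plain byte positions). It
shows the rule holding back the .ready mark when a write stops exactly
at a segment boundary, and it also shows the remaining hole: the rule
only knows that the next segment has been started, not that the record
is complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; the segment size and all names are hypothetical. */
#define TOY_SEG_SIZE 1024u

/*
 * Hypothetical delayed rule: segment segno may be marked .ready only
 * once at least one byte of segment segno + 1 has been written.
 */
static bool
toy_may_mark_ready(uint64_t segno, uint64_t write_upto)
{
    return write_upto > (segno + 1) * TOY_SEG_SIZE;
}

/* Number of segments touched by the record [rec_start, rec_end). */
static uint64_t
toy_segments_spanned(uint64_t rec_start, uint64_t rec_end)
{
    assert(rec_start < rec_end);
    return (rec_end - 1) / TOY_SEG_SIZE - rec_start / TOY_SEG_SIZE + 1;
}

/*
 * True if the delayed rule would still mark segno as .ready while a
 * record that starts in segno has not been written out in full.
 */
static bool
toy_delayed_rule_too_early(uint64_t segno, uint64_t write_upto,
                           uint64_t rec_start, uint64_t rec_end)
{
    return toy_may_mark_ready(segno, write_upto)
        && rec_start / TOY_SEG_SIZE == segno
        && write_upto < rec_end;
}
```

With a record at bytes 1000..1099 (two segments), a write up to 1100
marks segment 0 safely. With a record at bytes 1000..2099 (three
segments), a write up to 2048 already satisfies toy_may_mark_ready(0,
2048) although the record is incomplete, which is the three-segment
case mentioned above.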

Thoughts?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
