Re: corruption of WAL page header is never reported

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: nagata(at)sraoss(dot)co(dot)jp, ranier(dot)vf(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: corruption of WAL page header is never reported
Date: 2021-09-02 07:26:13
Message-ID: 20210902.162613.733026087360305493.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 2 Sep 2021 14:44:31 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2021/09/02 13:17, Kyotaro Horiguchi wrote:
> > Did you read the comment just above?
>
> Yes!

Glad to hear that, or..:p

> > xlog.c:12523
> >> * Check the page header immediately, so that we can retry immediately if
> >> * it's not valid. This may seem unnecessary, because XLogReadRecord()
> >> * validates the page header anyway, and would propagate the failure up to
> >> * ReadRecord(), which would retry. However, there's a corner case with
> >> * continuation records, if a record is split across two pages such that
> > So when not in standby mode, the same check is performed by xlogreader
> > which has the responsibility to validate the binary data read by
> > XLogPageRead. The page-header validation is a compromise to save a
> > specific case.
>
> Yes, so XLogPageRead() can skip the validation check of page head if
> not
> in standby mode. On the other hand, there is no problem if it still
> performs
> the validation check as it does for now. No?

Practically yes, and it has always been like that as you say.

> > I don't think it is good choice to conflate read-failure and header
> > validation failure from the view of modularity.
>
> I don't think that the proposed change does that. But maybe I failed

It's about your idea in a recent mail. not about the proposed
patch(es).

> to get
> your point yet... Anyway the proposed change just tries to reset
> errormsg_buf whenever XLogPageRead() retries, whatever error happened
> before. Also if errormsg_buf is set at that moment, it's reported.

I believe errmsg_buf is an interface to emit error messages dedicated
to xlogreader that doesn't have an access to elog facility, and
xlogreader doesn't (or ought not to or expect to) suppose
non-xlogreader callback functions set the variable. In that sense I
don't think theoriginally proposed patch is proper for the reason that
the non-xlogreader callback function may set errmsg_buf. This is what
I meant by the word "modularity".

For that reason I avoided in my second proposal to call
XLogReaderValidatePageHeader() at all while not in standby mode,
because calling the validator function while in non-standby mode
results in the non-xlogreader function return errmsg_buf. Of course
we can instead always consume errmsg_buf in the function but I don't
like to shadow the caller's task.

Does that makes sense?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-09-02 07:28:29 Re: pg_receivewal starting position
Previous Message Ronan Dunklau 2021-09-02 07:02:57 Re: pg_receivewal starting position