Re: corruption of WAL page header is never reported

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: alvherre(at)alvh(dot)no-ip(dot)org, nagata(at)sraoss(dot)co(dot)jp, ranier(dot)vf(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: corruption of WAL page header is never reported
Date: 2021-09-13 02:00:04
Message-ID: 20210913.110004.2070315095617522330.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 10 Sep 2021 10:38:39 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2021/09/07 2:02, Fujii Masao wrote:
> > Even if we do this while NOT in standby mode, ISTM that this function
> > doesn't
> > return with a valid errmsg_buf because it's reset. So probably the
> > comment
> > should be updated as follows?
> > -------------------------
> > We don't do this while not in standby mode because we don't need to
> > retry
> > immediately if the page header is not valid. Instead, XLogReadRecord()
> > is
> > responsible to check the page header.
> > -------------------------
>
> I updated the comment as above. Patch attached.
>
> - * it's not valid. This may seem unnecessary, because XLogReadRecord()
> + * it's not valid. This may seem unnecessary, because
> ReadPageInternal()
> * validates the page header anyway, and would propagate the failure up to
>
> I also applied this change because ReadPageInternal() not
> XLogReadRecord()
> calls XLogReaderValidatePageHeader().

Yeah, good catch.

+ * Note that we don't do this while not in standby mode because we don't
+ * need to retry immediately if the page header is not valid. Instead,
+ * ReadPageInternal() is responsible for validating the page header.

The point here is "retry this page, not this record", so "we don't need
to retry immediately" looks a bit ambiguous. So how about something
like this?

Note that we don't do this while not in standby mode because we don't
need to avoid retrying this entire record even if the page header is
not valid. Instead, ReadPageInternal() is responsible for validating
the page header in that case.

Everything else looks fine to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-09-13 02:15:51 Re: Added missing invalidations for all tables publication
Previous Message Kyotaro Horiguchi 2021-09-13 01:31:07 walsender timeout on logical replication set