Re: corruption of WAL page header is never reported

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: corruption of WAL page header is never reported
Date: 2021-07-18 14:27:16
Message-ID: 20210718232716.4a22d1fd673c0f633f68e5ac@sraoss.co.jp
Lists: pgsql-hackers

On Sat, 17 Jul 2021 18:40:02 -0300
Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> wrote:

> On Sat, Jul 17, 2021 at 16:57, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
> wrote:
>
> > Hello,
> >
> > I found that any corruption of a WAL page header found during recovery is
> > never reported in log messages. If a WAL page header is broken, it is
> > detected in XLogReaderValidatePageHeader called from XLogPageRead, but the
> > error messages are always reset and never reported.
> >
> >     if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
> >     {
> >         /* reset any error XLogReaderValidatePageHeader() might have set */
> >         xlogreader->errormsg_buf[0] = '\0';
> >         goto next_record_is_invalid;
> >     }
> >
> > Since commit 06687198018, we call XLogReaderValidatePageHeader here so that
> > we can check a page header and retry immediately if it's invalid, but the
> > error message is reset immediately and not reported. I guess the error
> > message is reset because we might get the right WAL after some retries.
> > However, I think it is better to report the error for each check in order
> > to let users know the actual issues found in the WAL.
> >
> > I attached a patch to fix it in this way.
> >
> I think that, to keep the same behavior as before, it is necessary to always run:
>
> /* reset any error XLogReaderValidatePageHeader() might have set */
> xlogreader->errormsg_buf[0] = '\0';
>
> no?

If we are not in StandbyMode, the check is not retried, and an error is returned
immediately. So I think we don't have to display an error message in such cases,
nor reset it. Instead, it would be better to leave the error message handling to
the caller of XLogReadRecord.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
