Re: corruption of WAL page header is never reported

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: corruption of WAL page header is never reported
Date: 2021-07-19 07:00:39
Message-ID: 20210719160039.23486c8b79d2e89a3a21a978@sraoss.co.jp
Lists: pgsql-hackers

On Mon, 19 Jul 2021 15:14:41 +0900 (JST)
Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:

> Hello.
>
> At Sun, 18 Jul 2021 04:55:05 +0900, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote in
> > Hello,
> >
> > I found that any corruption of a WAL page header found during recovery is never
> > reported in log messages. If a WAL page header is broken, it is detected in
> > XLogReaderValidatePageHeader called from XLogPageRead, but the error messages
> > are always reset and never reported.
>
> Good catch! Currently recovery stops without showing any reason if it is
> stopped by page-header errors.
>
> > I attached a patch to fix it in this way.
>
> However, it is a kind of a roof-over-a-roof. What we should do is
> just omit the check in XLogPageRead while in standby mode.

Your patch doesn't fix the issue that the error message is never reported in
standby mode. When a WAL page header is broken, the standby would silently
retry forever.

I think we have to let users know about corruption of a WAL page header even
in standby mode, don't we? Corruption of a WAL record header is always
reported, by the way. (See that XLogReadRecord calls ValidXLogRecordHeader.)
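
To make the problem concrete, here is a minimal standalone sketch of the
pattern I mean. It is not the PostgreSQL code: every Demo*/demo_* name, the
magic value, and the message text are hypothetical stand-ins, and the real
XLogPageRead is considerably more involved. It only models "validate the page
header, then reset the error buffer before anyone can see the message",
with a flag that reports the message before the reset.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical magic value; not PostgreSQL's XLOG_PAGE_MAGIC. */
#define DEMO_PAGE_MAGIC 0xD10D

/* Hypothetical, heavily simplified stand-ins for the real structures. */
typedef struct DemoPageHeader
{
    unsigned short magic;
} DemoPageHeader;

typedef struct DemoReader
{
    char errormsg_buf[128];
} DemoReader;

/* Counts the messages that actually reached the "log" (stderr here). */
static int demo_reports = 0;

/* Validate the header; on failure, leave a message in the reader's buffer. */
static bool
demo_validate_page_header(DemoReader *reader, const DemoPageHeader *hdr)
{
    if (hdr->magic != DEMO_PAGE_MAGIC)
    {
        snprintf(reader->errormsg_buf, sizeof(reader->errormsg_buf),
                 "invalid magic number %04X in page header",
                 (unsigned) hdr->magic);
        return false;
    }
    return true;
}

/*
 * Read one page. With report_before_reset == false this models the
 * behavior described above: the message is wiped before the caller can
 * see it. With report_before_reset == true the message is logged first.
 */
static bool
demo_read_page(DemoReader *reader, const DemoPageHeader *hdr,
               bool report_before_reset)
{
    reader->errormsg_buf[0] = '\0';

    if (demo_validate_page_header(reader, hdr))
        return true;

    if (report_before_reset && reader->errormsg_buf[0] != '\0')
    {
        fprintf(stderr, "LOG: %s\n", reader->errormsg_buf);
        demo_reports++;
    }

    /* Reset the message so that the caller will simply retry. */
    reader->errormsg_buf[0] = '\0';
    return false;
}
```

Calling demo_read_page() on a header with a wrong magic and
report_before_reset = false returns false with an empty errormsg_buf, so a
caller that retries has nothing to show the user. With report_before_reset =
true the same call logs the message once before the retry, which is roughly
the behavior I would like to have in standby mode as well.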

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
