Re: Make mesage at end-of-recovery less scary.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: michael(at)paquier(dot)xyz
Cc: bossartn(at)amazon(dot)com, david(at)pgmasters(dot)net, peter(dot)eisentraut(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, pgsql-hackers(at)lists(dot)postgresql(dot)org, jtc331(at)gmail(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: Make mesage at end-of-recovery less scary.
Date: 2021-11-09 07:27:51
Message-ID: 20211109.162751.1330179292938994347.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comments!

At Tue, 9 Nov 2021 09:53:15 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> On Mon, Nov 08, 2021 at 02:59:46PM +0900, Kyotaro Horiguchi wrote:
> > While checking the behavior for the case of missing-contrecord, I
> > noticed that emode_for_corrupt_record() doesn't work as expected since
> > readSource is reset to XLOG_FROM_ANY after a read failure. We could
> > remember the last failed source but pg_wal should have been visited if
> > page read error happened so I changed the function so that it treats
> > XLOG_FROM_ANY the same way with XLOG_FROM_PG_WAL.
>
> FWIW, I am not much a fan of assuming that it is fine to use
> XLOG_FROM_ANY as a condition here. The comments on top of
> emode_for_corrupt_record() make it rather clear what the expectations
> are, and this is the default readSource.

The readSource is expected by the function to be the failed source but
it goes back to XLOG_FROM_ANY on page read failure. So the function
*is* standing on the wrong assumption. I noticed that currentSource
holds the last accessed source (but forgot about that). So it is
exactly what we need here. No longer need to introduce the unclear
assumption by using it.

> > (Otherwise we see "LOG: reached end-of-WAL at .." message after
> > "WARNING: missing contrecord at.." message.)
>
> + /* broken record found */
> + ereport(WARNING,
> + (errmsg("redo is skipped"),
> + errhint("This suggests WAL file corruption. You might need to check the database.")));
> This looks rather scary to me, FWIW, and this could easily be reached

Yes, the message is intentionally scary, since we don't come here in
the case of clean WAL:)

> if one forgets about EndOfWAL while hacking on xlogreader.c.
> Unlikely so, still.

I don't understand. Isn't it the case of almost every feature?

The patch compells hackers to maintain the condition for recovery
being considered cleanly ended. If the last record doesn't meet the
condition, the WAL file should be considered having a
problem. However, I don't see the condition expanded to have another
term in future.

Even if someone including myself broke that condition, we will at
worst unwantedly see a scary message. And I believe almost all
hackers can easily find it a bug from the DETAILED message shown along
aside. I'm not sure such bugs could be found in development phase,
though..

> + report_invalid_record(state,
> + "missing contrecord at %X/%X",
> + LSN_FORMAT_ARGS(RecPtr));
> Isn't there a risk here to break applications checking after error
> messages stored in the WAL reader after seeing a contrecord?

I'm not sure you are mentioning the case where no message is stored
previously, or the case where already a message is stored. The former
is fine as the record is actually broken. But I was missing the latter
case. In this version I avoided to overwite the error message.

> + if (record->xl_tot_len == 0)
> + {
> + /* This is strictly not an invalid state, so phrase it as so. */
> + report_invalid_record(state,
> + "record length is 0 at %X/%X",
> + LSN_FORMAT_ARGS(RecPtr));
> + state->EndOfWAL = true;
> + return false;
> + }
> This assumes that a value of 0 for xl_tot_len is a synonym of the end
> of WAL, but cannot we have also a corrupted record in this case in the
> shape of xl_tot_len being 0? We validate the full record after
> reading the header, so it seems to me that we should not assume that
> things are just ending as proposed in this patch.

Yeah, it's the most serious concern to me. So I didn't hide the
detailed message in the "end-of-wal reached message".

> LOG: reached end of WAL at 0/512F198 on timeline 1 in pg_wal during crash recovery
> DETAIL: record length is 0 at 0/512F210

I believe everyone regards zero record length as fine unless something
wrong is seen afterwards. However, we can extend the check to the
whole record header. I think it is by far nearer to the perfect for
almost all cases. The attached emits the following message for the
good (true end-of-WAL) case.

> LOG: reached end of WAL at 0/512F4A0 on timeline 1 in pg_wal during crash recovery
> DETAIL: empty record header found at 0/512F518

If garbage bytes are found in the header area, the following log will
be left. I think we can have a better message here.

> WARNING: garbage record header at 0/2095458
> LOG: redo done at 0/2095430 system usage: CPU: user: 0.03 s, system: 0.01 s, elapsed: 0.04 s

This is the updated version.

- emode_for_currupt_record() now uses currentSource instead of
readSource.

- If zero record length is faced, make sure the whole header is zeroed
before deciding it as the end-of-WAL.

- Do not overwrite existig message when missing contrecord is
detected. The message added here is seen in the TAP test log
026_overwrite_contrecord_standby.log

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v6-0001-Make-End-Of-Recovery-error-less-scary.patch text/x-patch 9.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-11-09 07:40:04 Re: Skipping logical replication transactions on subscriber side
Previous Message Masahiko Sawada 2021-11-09 06:43:11 Re: Skipping logical replication transactions on subscriber side