Re: Make mesage at end-of-recovery less scary.

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pashkin(dot)elfe(at)gmail(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jtc331(at)gmail(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Make mesage at end-of-recovery less scary.
Date: 2022-02-08 13:05:34
Message-ID: CAE9k0PmtTcFoi-KkEk33O3JVWXU9aw7gXNfwo2zt4Hh6Fq950g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here are some of my review comments on the v11 patch:

- (errmsg_internal("reached end of WAL in
pg_wal, entering archive recovery")));
+ (errmsg_internal("reached end of WAL at %X/%X
on timeline %u in %s during crash recovery, entering archive
recovery",
+ LSN_FORMAT_ARGS(ErrRecPtr),
+ replayTLI,
+ xlogSourceNames[currentSource])));

Why crash recovery? Won't this message get printed even during PITR?

I just did a PITR and could see these messages in the logfile.

2022-02-08 18:00:44.367 IST [86185] LOG: starting point-in-time
recovery to WAL location (LSN) "0/5227790"
2022-02-08 18:00:44.368 IST [86185] LOG: database system was not
properly shut down; automatic recovery in progress
2022-02-08 18:00:44.369 IST [86185] LOG: redo starts at 0/14DC8D8
2022-02-08 18:00:44.978 IST [86185] DEBUG1: reached end of WAL at
0/3FFFFD0 on timeline 1 in pg_wal during crash recovery, entering
archive recovery

==

+ /*
+ * If we haven't emit an error message, we have safely reached the
+ * end-of-WAL.
+ */
+ if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
+ {
+ char *fmt;
+
+ if (StandbyMode)
+ fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during standby mode");
+ else if (InArchiveRecovery)
+ fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during archive recovery");
+ else
+ fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during crash recovery");
+
+ ereport(LOG,
+ (errmsg(fmt, LSN_FORMAT_ARGS(ErrRecPtr), replayTLI,
+ xlogSourceNames[currentSource]),
+ (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+ }

Doesn't it make sense to add an assert statement inside this if-block
that will check for xlogreader->EndOfWAL?

==

- * We only end up here without a message when XLogPageRead()
- * failed - in that case we already logged something. In
- * StandbyMode that only happens if we have been triggered, so we
- * shouldn't loop anymore in that case.
+ * If we get here for other than end-of-wal, emit the error
+ * message right now. Otherwise the message if any is shown as a
+ * part of the end-of-WAL message below.
*/

For consistency, I think we can replace "end-of-wal" with
"end-of-WAL". Please note that everywhere else in the comments you
have used "end-of-WAL". So why not the same here?

==

ereport(LOG,
- (errmsg("replication terminated by
primary server"),
- errdetail("End of WAL reached on
timeline %u at %X/%X.",
- startpointTLI,
-
LSN_FORMAT_ARGS(LogstreamResult.Write))));
+ (errmsg("replication terminated by
primary server on timeline %u at %X/%X.",
+ startpointTLI,
+
LSN_FORMAT_ARGS(LogstreamResult.Write))));

Is this change really required? I don't see any issue with the
existing error message.

==

Lastly, are we also planning to backport this patch?

--
With Regards,
Ashutosh Sharma.

On Wed, Feb 2, 2022 at 11:05 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Tue, 1 Feb 2022 12:38:01 +0400, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote in
> > Maybe it can be written little bit shorter:
> > pe = (char *) record + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> > as
> > pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> > ?
>
> That difference would be a matter of taste, but I found it looks
> cleaner that definition and assignment is separated for both p and pe.
> Now it is like the following.
>
> > char *p;
> > char *pe;
> >
> > /* scan from the beginning of the record to the end of block */
> > p = (char *) record;
> > pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
>
>
> > The problem that pgindent sometimes reflow formatting of unrelated blocks
> > is indeed existing. But I think it's right to manually leave pgindent-ed
> > code only on what is related to the patch. The leftover is pgindent-ed in a
> > scheduled manner sometimes, so don't need to bother.
>
> Yeah, I meant that it is a bit annoying to unpginden-ting unrelated
> edits:p
>
> > I'd like to set v10 as RfC.
>
> Thanks! The suggested change is done in the attached v11.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-02-08 13:10:08 Re: row filtering for logical replication
Previous Message Julien Rouhaud 2022-02-08 12:55:18 Re: Database-level collation version tracking