Re: detailed error message of pg_waldump

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: detailed error message of pg_waldump
Date: 2021-07-05 07:04:27
Message-ID: CAD21AoDRP2KpQ1BuOG7BjgZhMX+7ksTghp9TdNjM_jU25QgXEA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 16, 2021 at 5:36 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> Thanks!
>
> At Wed, 16 Jun 2021 16:52:11 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in
> > On Fri, Jun 4, 2021 at 5:35 PM Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > >
> > > In a very common operation of accidentally specifying a recycled
> > > segment, pg_waldump often returns the following obscure message.
> > >
> > > $ pg_waldump 00000001000000000000002D
> > > pg_waldump: fatal: could not find a valid record after 0/2D000000
> > >
> > > The more detailed message is generated internally and we can use it.
> > > That looks like the following.
> > >
> > > $ pg_waldump 00000001000000000000002D
> > > pg_waldump: fatal: unexpected pageaddr 0/24000000 in log segment 00000001000000000000002D, offset 0
> > >
> > > Is it worth doing?
> >
> > Perhaps we need both? The current message describes where the error
> > happened and the message internally generated describes the details.
> > It seems to me that both are useful. For example, if we find an error
> > during XLogReadRecord(), we show both as follows:
> >
> >     if (errormsg)
> >         fatal_error("error in WAL record at %X/%X: %s",
> >                     LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
> >                     errormsg);
>
> Yeah, I thought that it might be a bit verbose and lengthy, but actually
> we have another place that does that. One more point is whether we
> can have a case where first_record is invalid but errormsg is NULL
> there. WALDumpReadPage immediately exits, so we should always have a
> message in that case, according to the comment in ReadRecord.
>
> > * We only end up here without a message when XLogPageRead()
> > * failed - in that case we already logged something. In
> > * StandbyMode that only happens if we have been triggered, so we
> > * shouldn't loop anymore in that case.
>
> So that can be an assertion.
>
> Now the message looks like this.
>
> $ pg_waldump /home/horiguti/data/data_work/pg_wal/000000020000000000000010
> pg_waldump: fatal: could not find a valid record after 0/0: unexpected pageaddr 0/9000000 in log segment 000000020000000000000010, offset 0
>

Thank you for updating the patch!
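
To check that I read the new format correctly, here is a minimal
standalone sketch of how the location part and the reader's detail
message are combined. Note that format_find_error() and the XLogRecPtr
typedef below are made up for illustration and are not taken from the
patch; the assertion reflects your point that errormsg should never be
NULL at this place.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for PostgreSQL's XLogRecPtr, defined only to keep this sketch self-contained. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper (not in the patch): build
 * "<where the search started>: <detail from the reader>" into buf.
 * Returns what snprintf returns.
 */
static int
format_find_error(char *buf, size_t buflen, XLogRecPtr start,
                  const char *errormsg)
{
    /* Assumption from the discussion: a detail message is always available. */
    assert(errormsg != NULL && strlen(errormsg) > 0);

    /* Print the LSN as high/low 32-bit halves, as %X/%X does in the tree. */
    return snprintf(buf, buflen,
                    "could not find a valid record after %X/%X: %s",
                    (unsigned int) (start >> 32),
                    (unsigned int) start,
                    errormsg);
}
```

With start = 0 and the detail message from your example, this should
produce the same text as the output quoted above.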

+ *
+ * The returned pointer (or *errormsg) points to an internal buffer that's
+ * valid until the next call to XLogFindNextRecord or XLogReadRecord.
*/

The comment on XLogReadRecord() also has a similar description. Should
we update it as well?
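
To illustrate what that wording means for callers, here is a toy sketch.
ToyReader, toy_read() and keep_message() are hypothetical and are not
xlogreader code; they only mimic the "message lives in an internal
buffer until the next call" contract, under the assumption that each
call overwrites the previous message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy reader (hypothetical): owns the only buffer error text is written to. */
typedef struct ToyReader
{
    char        errormsg_buf[128];
} ToyReader;

/*
 * Always fails, reporting into the reader's internal buffer.
 * Each call overwrites whatever the previous call left there.
 */
static int
toy_read(ToyReader *reader, int recno, const char **errormsg)
{
    snprintf(reader->errormsg_buf, sizeof(reader->errormsg_buf),
             "invalid record %d", recno);
    *errormsg = reader->errormsg_buf;
    return -1;
}

/* Copy a message that must outlive the next read call; caller frees it. */
static char *
keep_message(const char *errormsg)
{
    size_t      len;
    char       *copy;

    assert(errormsg != NULL);
    len = strlen(errormsg) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, errormsg, len);
    return copy;
}
```

A caller that only prints the message right away, as pg_waldump does
with fatal_error(), does not need the copy; it matters only when the
message is kept across another read call.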

BTW, is this patch registered in the current commitfest? I could not find it.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
