Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
Date: 2022-11-15 10:02:32
Message-ID: CALj2ACWuSumsFt1Fm4-LESg4jf9BDtMZcu5YJUZ62eohnc1Drg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 11, 2022 at 5:52 PM Maxim Orlov <orlovmg(at)gmail(dot)com> wrote:
>
> Hi!
>
> I've watched over the patch and consider it useful. Applies without conflicts. The functionality of the patch itself is
> meets declared functionality.

Thanks for reviewing.

> But, in my view, some improvements may be proposed. We should be more, let's say, cautious (or a paranoid if you wish),
> in pg_walfile_offset_lsn while dealing with user input values. What I mean by that is:
> - to init input vars of sscanf, i.e. tli, log andseg;
> - check the return value of sscanf call;
> - check filename max length.

IsXLogFileName() will take care of this. Also, I've added a new inline
function XLogIdFromFileName() that parses file name and returns log
and seg along with XLogSegNo and timeline id. This new function avoids
an extra sscanf() call as well.

> Another question arises for me: is this function can be called during recovery? If not, we should ereport about this, should we?

It's just a utility function and doesn't depend on any of the server's
current state (unlike pg_walfile_name()), hence it doesn't matter if
this function is called during recovery.

> And one last note here: pg_walfile_offset_lsn is accepting NULL values and return NULL in this case. From a theoretical
> point of view, this is perfectly fine. Actually, I think this is exactly how it supposed to be, but I'm not sure if there are no other opinions here.

These functions are marked as 'STRICT', meaning a null is returned,
without even calling the function, if any of the input parameters is
null. I think we can keep the behaviour the same as its friends.

On Fri, Nov 11, 2022 at 11:05 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>

Thanks for reviewing.

> Hmm, in 0002, why not return the timeline from the LSN too? It seems a
> waste not to have it.

Yeah, that actually makes sense. We might be tempted to return segno
too, but it's not something that we emit to the user elsewhere,
whereas we emit timeline id.

> + ereport(ERROR,
> + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
> + errmsg("\"offset\" must not be negative or greater than or "
> + "equal to WAL segment size")));
>
> I don't think the word offset should be in quotes; and please don't cut
> the line. So I propose
>
> errmsg("offset must not be negative or greater than or equal to the WAL segment size")));

Changed.

While on this, I noticed that the pg_walfile_name_offset() isn't
covered in tests. I took an opportunity and added a simple test case
along with pg_walfile_offset_lsn().

I'm attaching the v3 patch set for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v3-0001-Add-LSN-along-with-offset-to-error-messages-repor.patch application/x-patch 7.7 KB
v3-0002-Introduce-pg_walfile_offset_lsn.patch application/x-patch 8.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-11-15 10:09:54 Re: pg_basebackup's --gzip switch misbehaves
Previous Message Drouvot, Bertrand 2022-11-15 09:48:40 Re: Split index and table statistics into different types of stats