Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: RKN Sai Krishna <rknsaiforpostgres(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-03-26 05:01:01
Message-ID: CALj2ACXBH6n40sgYZvtkYzkatYteL+bHftc-rEcOmWcmrfnF=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 25, 2022 at 8:37 PM RKN Sai Krishna
<rknsaiforpostgres(at)gmail(dot)com> wrote:
>
> Hi Bharath,
>
> First look at the patch, bear with me if any of the following comments are repeated.

Thanks RKN, for playing around with the patches.

> 1. With pg_get_wal_record(lsn), say a WAL record start, end lsn range contains the specified LSN, wouldn't it be more meaningful to show the corresponding WAL record.

In general, all the functions will first look for a first valid WAL
record from the given input lsn/start lsn(XLogFindNextRecord) and then
give info of all the valid records including the first valid WAL
record until either the given end lsn or till end of WAL depending on
the function used.

> For example, upon providing '0/17335E7' as input, and I see get the WAL record ('0/1733618', '0/173409F') as output and not the one with start and end lsn as ('0/17335E0', '0/1733617').

If '0/17335E7' is an LSN containing a valid WAL record,
pg_get_wal_record gives the info of that, otherwise if there's any
next valid WAL record, it finds and gives that info. '0/17335E0' is
before '0/17335E7' the input lsn, so it doesn't show that record, but
the next valid record.

All the pg_walinspect functions don't look for the nearest valid WAL
record (could be previous to input lsn or next to input lsn), but they
look for the next valid WAL record. This is because the xlogreader
infra now has no API for backward iteration from a given LSN ( it has
XLogFindNextRecord and XLogReadRecord which scans the WAL in forward
direction). But, it's a good idea to have XLogFindPreviousRecord and
XLogReadPreviousRecord versions (as we have links for previous WAL
record in each WAL record) but that's a separate discussion.

> With pg_walfile_name(lsn), we can find the WAL segment file name that contains the specified LSN.

Yes.

> 2. I see the following output for pg_get_wal_record. Need to have a look at the spaces I suppose.

I believe this is something psql does for larger column outputs for
pretty-display. When used in a non-psql client, the column values are
returned properly. Nothing to do with the pg_walinspect patches here.

> 3. Should these functions be running in standby mode too? We do not allow WAL control functions to be executed during recovery right?

There are functions that can be executable during recovery
pg_last_wal_receive_lsn, pg_last_wal_replay_lsn. The pg_walinspect
functions are useful even in recovery and I don't see a strong reason
to not support them. Hence, I'm right now supporting them.

> 4. If the wal segment corresponding to the start lsn is removed, but there are WAL records which could be read in the specified input lsn range, would it be better to output the existing WAL records displaying a message that it is a partial list of WAL records and the WAL files corresponding to the rest are already removed, rather than erroring out saying "requested WAL segment has already been removed"?

"requested WAL segment %s has already been removed" is a common error
across the xlogreader infra (see wal_segment_open) and I don't want to
invent a new behaviour. And all the segment_open callbacks report an
error when they are not finding the WAL file that they are looking
for.

> 5. Following are very minor comments in the code
>
> Correct the function description by removing "return the LSN up to which the server has WAL" for IsFutureLSN

That's fine, because it actually returns curr_lsn via the function
param curr_lsn. However, I modified the comment a bit.

> In GetXLogRecordInfo, good to have pfree in place for rec_desc, rec_blk_ref, data

No, we are just returning pointer to the string, not deep copying, see
CStringGetTextDatum. All the functions get executed within a
function's memory context and after handing off the results to the
client that gets deleted, deallocating all the memory.

> In GetXLogRecordInfo, can avoid calling XLogRecGetInfo(record) multiple times by capturing in a variable

XLogRecGetInfo is not a function, it's a macro, so that's fine.
#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)

> In GetWALDetailsGuts, setting end_lsn could be done in single if else and similarly we can club the if statements verifying if the start lsn is a future lsn.

The existing if conditions are:

if (IsFutureLSN(start_lsn, &curr_lsn))
if (!till_end_of_wal && end_lsn >= curr_lsn)
if (till_end_of_wal)
if (start_lsn >= end_lsn)

I clubbed them like this:
if (!till_end_of_wal)
if (IsFutureLSN(start_lsn, &curr_lsn))
if (!till_end_of_wal && end_lsn >= curr_lsn)
else if (till_end_of_wal)

Other if conditions are serving different purposes, so I'm leaving them as-is.

Attaching v16 patch-set, only change in v16-0002-pg_walinspect.patch,
others remain the same.

Regards,
Bharath Rupireddy.

Attachment Content-Type Size
v16-0001-Refactor-pg_waldump-code.patch application/octet-stream 18.3 KB
v16-0002-pg_walinspect.patch application/octet-stream 30.2 KB
v16-0003-pg_walinspect-tests.patch application/octet-stream 13.0 KB
v16-0004-pg_walinspect-docs.patch application/octet-stream 12.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-03-26 05:21:15 Remove an unused function GetWalRcvWriteRecPtr
Previous Message Shinoda, Noriyoshi (PN Japan FSIP) 2022-03-26 04:09:17 RE: Column Filtering in Logical Replication