Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: RKN Sai Krishna <rknsaiforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-04-06 05:02:43
Message-ID: d3d60da5fd9d06282a833253fbcb4fc92531dae4.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2022-04-04 at 09:15 +0530, Bharath Rupireddy wrote:
> My intention is to return the overall undecoded WAL record [5] i.e.
> the data starting from XLogReadRecord's output [6] till length
> XLogRecGetTotalLen(xlogreader);. Please see [7], where Andres agreed
> to have this function, I also mentioned a possible use-case there.

The current patch does not actually do this: it's returning a pointer
into the DecodedXLogRecord struct, which doesn't have the raw bytes of
the WAL record.

To return the raw bytes of the record is not entirely trivial: it seems
we have to look in the decoded record and either find a pointer into
readBuf, or readRecordBuf, depending on whether the record crosses a
boundary or not. If we find a good way to do this I'm fine keeping the
function, but if not, we can leave it for v16.

> pg_get_wal_record_info returns the main data of the WAL record
> (xl_heap_delete, xl_heap_insert, xl_heap_multi_insert, xl_heap_update
> and so on).

We also discussed just removing the main data from the output here.
It's not terribly useful, and could be arbitrarily large. Similar to
how we leave out the backup block data and images.

> As identified in [1], SQL-version of stats function is way slower in
> normal cases, hence it was agreed (by Andres, Kyotaro-san and myself)
> to have a C-function for stats.a pointer into

Now I agree. We should also have an equivalent of "pg_waldump --
stats=record" though, too.

> Yes, that's already part of the description column (much like
> pg_waldump does) and the users can still do it with GROUP BY and
> HAVING clauses, see [4].

I still think an extra column for the results of rm_identify() would
make sense. Not critical, but seems useful.

> As mentioned in [1], read_local_xlog_page_no_wait required because
> the
> functions can still wait in read_local_xlog_page for WAL while
> finding
> the first valid record after the given input LSN (the use case is
> simple - just provide input LSN closer to server's current flush LSN,
> may be off by 3 or 4 bytes).

Did you look into using XLogReadAhead() rather than XLogReadRecord()?

> It's pretty much clear to the users with till_end_of_wal functions
> instead of cooking many things into the same functions with default
> values for input LSNs as NULL which also requires the functions to be
> "CALLED ON NULL INPUT" types which isn't good. This was also
> suggested
> by Andres, see [2], and I agree with it.

OK, it's a matter of taste I suppose. I don't have a strong opinion.

> > 2. Rename pg_get_wal_record_info -> pg_get_wal_record
> >
> > 3. Rename pg_get_wal_records_info -> pg_get_wal_records
>
> As these functions aren't returning the WAL record data, but info
> about it (decoded data), I would like to retain the function names
> as-is.

The name pg_get_wal_records_info bothers me slightly, but I don't have
a better suggestion.

One other thought: functions like pg_logical_emit_message() return an
LSN, but if you feed that into pg_walinspect you get the *next* record.
That makes sense because pg_logical_emit_message() returns the result
of XLogInsertRecord(), which is the end of the last inserted record.
But it can be slightly annoying/confusing. I don't have any particular
suggestion, but maybe it's worth a mention in the docs or something?

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-04-06 05:08:21 Re: zero char is returned as space
Previous Message Konstantin Izmailov 2022-04-06 04:58:17 zero char is returned as space