Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-02-05 14:31:20
Message-ID: CALj2ACXoLDUM8SbqS0m+8rqm68pcnbReaj42V1+LOtGntCkpDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 1, 2022 at 3:10 AM Greg Stark <stark(at)mit(dot)edu> wrote:
>
> So I looked at this patch and I have the same basic question as Bruce.

Thanks a lot for the comments.

> Do we really want to expose every binary tool associated with Postgres
> as an extension? Obviously this is tempting for cloud provider users
> which is not an unreasonable argument. But it does have consequences.

Perhaps not every tool needs to be exposed, but given the advantages
that pg_walinspect can provide it's a good candidate to have it as a
core extension. Some of the advantages are - debugging, WAL analysis,
feeding WAL stats and info to dashboards to show customers and answer
their queries, RCA etc., for educational purposes - one can understand
the WAL structure, stats, different record types etc. Another nice
thing is getting raw WAL data out of the running server (of course all
the users can't get it only the allowed ones, currently users with
pg_monitor role, if required we can change it to some other predefined
role). For instance, the raw WAL data can be fed to external page
repair tools to apply on a raw page (one can get this from pageinspect
extension).

> 1) Some things like pg_waldump are running code that is not normally
> under user control. This could have security issues or reliability
> issues.

I understand this and also I think the same concern applies to
pageinspect tool which exposes getting raw page data. The
pg_walinspect functions are currently accessible by the users with
pg_monitor role, if required we can change this to some other
predefined role.

> On that front I'm especially concerned that pg_verify_raw_wal_record()
> for example would let an attacker feed arbitrary hand crafted xlog
> records into the parser which is not normally something a user can do.
> If they feed it something it's not expecting it might be easy to cause
> a crash and server restart.

This function does nothing (no writes) to the server but just checking
the CRC of the WAL record. If at all one can make the server crash
with an input, then that should be a problem with the server code
which needs to be fixed. But IMO this function doesn't have a concern
as such.

> There's also a bit of concern about data retention. Generally in
> Postgres when rows are deleted there's very weak guarantees about the
> data really being wiped. We definitely don't wipe it from disk of
> course. And things like pageinspect could expose it long after it's
> been deleted. But one might imagine after pageinspect shows it's gone
> and/or after a vacuum full the data is actually purged. But then
> something like pg_walinspect would make even that insufficient.

The idea of pg_walinspect is to get the WAL info, data and stats out
of a running postgres server, if the WAL isn't available, the
functions would say that.

> 2) There's no documentation. I'm guessing you hesitated to write
> documentation until the interface is settled but actually sometimes
> writing documentation helps expose things in the interface that look
> strange when you try to explain them.

I will send out the new patch set with documentation soon.

> 3) And the interface does look a bit strange. Like what's the deal
> with pg_get_wal_record_info_2() ? I gather it's just a SRF version of
> pg_get_wal_record_info() but that's a strange name. And then what's
> the point of pg_get_wal_record_info() at all? Why wouldn't the SRF be
> sufficient even for the specific case of a single record?

I agree, pg_get_wal_record_info_2 is a poor naming.
pg_get_wal_record_info_2 takes range of LSN (start and end) to give
the wal info, whereas pg_get_wal_record_info just takes one LSN. Maybe
I will change pg_get_wal_record_info_2 to pg_get_wal_record_info_range
or pg_get_wal_records_info or someother namign is better? If the
suggestion is to overload pg_get_wal_record_info one with single LSN
and another with start and end LSNs, I'm okay with that too.
Otherwise, I can have pg_get_wal_record_info with start and end LSN
(end LSN default to NULL) and return setof record.

> And the stats functions seem a bit out of place to me. If the SRF
> returned the data in the right format the user should be able to do
> aggregate queries to generate these stats easily enough. If anything a
> simple SQL function to do the aggregations could be provided.
>
> Now this is starting to get into the realm of bikeshedding but... Some
> of the code is taken straight from pg_waldump and does things like:
>
> + appendStringInfo(&rec_blk_ref, "blkref #%u: rel %u/%u/%u fork %s blk %u",
>
> But that's kind of out of place for an SQL interface. It makes it hard
> to write queries since things like the relid, block number etc are in
> the string. If I wanted to use these functions I would expect to be
> doing something like "select all the decoded records pertaining to
> block n".

I will think more about this and change it in the next version of the
patch set, perhaps I will add more columns to the functions.

> All that said, I don't want to gatekeep based on this kind of
> criticism. The existing code is based on pg_waldump and if we want an
> extension to expose that then that's a reasonable place to start. We
> can work on a better format for the data later it doesn't mean we
> shouldn't start with something we have today.

IMO, we can always extend the functions in future, once the
pg_walinspect extension gets in with minimum number of much-required
and basic functions.

> I suppose this raises the issue of what happens if someone fixes that.
> They'll now have to update pg_waldump *and* pg_walinspect. I don't
> think that would actually be a lot of work but it's definitely more
> than just one. Also, perhaps they should be in the same contrib
> directory so at least people won't forget there are two.

Currently, all the tools are placed in src/bin and extensions are in
contrib directory. I don't think we ever keep the extension in src/bin
or vice versa. Having said, this maybe we can add comments on having
to change/fix in both pg_waldump and pg_walinspect. We also have to
deal with this situation in some of the existing tools such as
pg_controldata.

Regards,
Bharath Rupireddy.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-02-05 19:59:44 Re: Synchronizing slots from primary to standby
Previous Message Tomas Vondra 2022-02-05 13:32:40 Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?