Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Greg Stark <stark(at)mit(dot)edu>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-01-31 21:40:09
Message-ID: CAM-w4HMVn1D3zyi6OM-8eMqZizLOReB4z3fTGUwDujTvaqy=vA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So I looked at this patch and I have the same basic question as Bruce.
Do we really want to expose every binary tool associated with Postgres
as an extension? Obviously this is tempting for cloud provider users
which is not an unreasonable argument. But it does have consequences.

1) Some things like pg_waldump are running code that is not normally
under user control. This could have security issues or reliability
issues.

On that front I'm especially concerned that pg_verify_raw_wal_record()
for example would let an attacker feed arbitrary hand crafted xlog
records into the parser which is not normally something a user can do.
If they feed it something it's not expecting it might be easy to cause
a crash and server restart.

There's also a bit of concern about data retention. Generally in
Postgres when rows are deleted there's very weak guarantees about the
data really being wiped. We definitely don't wipe it from disk of
course. And things like pageinspect could expose it long after it's
been deleted. But one might imagine after pageinspect shows it's gone
and/or after a vacuum full the data is actually purged. But then
something like pg_walinspect would make even that insufficient.

2) There's no documentation. I'm guessing you hesitated to write
documentation until the interface is settled but actually sometimes
writing documentation helps expose things in the interface that look
strange when you try to explain them.

3) And the interface does look a bit strange. Like what's the deal
with pg_get_wal_record_info_2() ? I gather it's just a SRF version of
pg_get_wal_record_info() but that's a strange name. And then what's
the point of pg_get_wal_record_info() at all? Why wouldn't the SRF be
sufficient even for the specific case of a single record?

And the stats functions seem a bit out of place to me. If the SRF
returned the data in the right format the user should be able to do
aggregate queries to generate these stats easily enough. If anything a
simple SQL function to do the aggregations could be provided.

Now this is starting to get into the realm of bikeshedding but... Some
of the code is taken straight from pg_waldump and does things like:

+ appendStringInfo(&rec_blk_ref, "blkref #%u: rel %u/%u/%u fork %s blk %u",

But that's kind of out of place for an SQL interface. It makes it hard
to write queries since things like the relid, block number etc are in
the string. If I wanted to use these functions I would expect to be
doing something like "select all the decoded records pertaining to
block n".

All that said, I don't want to gatekeep based on this kind of
criticism. The existing code is based on pg_waldump and if we want an
extension to expose that then that's a reasonable place to start. We
can work on a better format for the data later it doesn't mean we
shouldn't start with something we have today.

4) This isn't really an issue with your patch at all but why on earth
do we have a bitvector for WAL compression methods?! Like, what does
it mean to have multiple compression methods set? That should just be
a separate field with values for each type of compression surely?

I suppose this raises the issue of what happens if someone fixes that.
They'll now have to update pg_waldump *and* pg_walinspect. I don't
think that would actually be a lot of work but it's definitely more
than just one. Also, perhaps they should be in the same contrib
directory so at least people won't forget there are two.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-01-31 21:48:30 Re: Support for NSS as a libpq TLS backend
Previous Message Tom Lane 2022-01-31 21:39:01 Re: Support tab completion for upper character inputs in psql