Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-02-10 02:26:13
Message-ID: CALj2ACUS9+54QGPtUjk76dcYW-AMKp3hPe-U+pQo2-GpE4kjtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 6, 2022 at 9:15 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Jan 31, 2022 at 4:40 PM Greg Stark <stark(at)mit(dot)edu> wrote:
> > So I looked at this patch and I have the same basic question as Bruce.
> > Do we really want to expose every binary tool associated with Postgres
> > as an extension? Obviously this is tempting for cloud provider users
> > which is not an unreasonable argument. But it does have consequences.
> >
> > 1) Some things like pg_waldump are running code that is not normally
> > under user control. This could have security issues or reliability
> > issues.
>
> For what it's worth, I am generally in favor of having something like
> this in PostgreSQL. I think it's wrong of us to continue assuming that
> everyone has command-line access. Even when that's true, it's not
> necessarily convenient. If you choose to use a relational database,
> you may be the sort of person who likes SQL. And if you are, you may
> want to have the database tell you what's going on via SQL rather than
> command-line tools or operating system utilities. Imagine if we didn't
> have pg_stat_activity and you had to get that information by running a
> separate binary. Would anyone like that? Why is this case any
> different?
>
> A few years ago we exposed data from pg_control via SQL and similar
> concerns were raised - but it turns out to be pretty useful. I don't
> know why this shouldn't be equally useful. Sure, there's some
> duplication in functionality, but it's not a huge maintenance burden
> for the project, and people (including me) like having it available. I
> think the same things will be true here.
>
> If decoding WAL causes security problems, that's something we better
> fix, because WAL is constantly decoded on standbys and via logical
> decoding on systems all over the place. I agree that we can't let
> users supply their own hand-crafted WAL records to be decoded without
> causing more trouble than we can handle, but if it's not safe to
> decode the WAL the system generated than we are in a lot of trouble
> already.
>
> I hasten to say that I'm not endorsing every detail or indeed any
> detail of the proposed patch, and some of the concerns you mention
> later sound well-founded to me. But I disagree with the idea that we
> shouldn't have both a command-line utility that roots through files on
> disk and an SQL interface that works with a running system.

Thanks Robert for your comments.

> + appendStringInfo(&rec_blk_ref, "blkref #%u: rel %u/%u/%u fork %s blk %u",
>
> But that's kind of out of place for an SQL interface. It makes it hard
> to write queries since things like the relid, block number etc are in
> the string. If I wanted to use these functions I would expect to be
> doing something like "select all the decoded records pertaining to
> block n".

Thanks Greg for your review of the patches. Since there can be
multiple blkref for WAL record type HEAP2 (for multi inserts
basically) [1], I couldn't find a better way to break it and represent
it as a non-text column. IMO this is simpler and users can easily find
out answers to "how many WAL records my relation generated between
lsn1 and lsn2 or how many WAL records of type Heap exist and so on?",
see [2]. I've also added a test case to just show this in 0002 patch.

Here's the v4 patch set that has the following changes along with
Greg's review comments addressed:

1) Added documentation as 0003 patch.
2) Removed CHECKPOINT commands from tests as it is unnecessary.
3) Added input validation code and tests.
4) A few more comments have been added.
5) Currently, only superusers can create the extension, but users with
the pg_monitor role can use the functions.
6) Test cases are basic yet they cover all the functions, error cases
with input validations, I don't think we need to add many more test
cases as suggested upthread, but I'm open to add a few more if I miss
any use-case.

Please review the v4 patch set further and let me know your thoughts.

[1]
rmgr: Heap2 len (rec/tot): 64/ 8256, tx: 0, lsn:
0/014A9070, prev 0/014A8FF8, desc: VISIBLE cutoff xid 709 flags 0x01,
blkref #0: rel 1663/12757/16384 fork vm blk 0 FPW, blkref #1: rel
1663/12757/16384 blk 0
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
0/014AB0C8, prev 0/014A9070, desc: VISIBLE cutoff xid 709 flags 0x01,
blkref #0: rel 1663/12757/16384 fork vm blk 0, blkref #1: rel
1663/12757/16384 blk 1
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
0/014AB108, prev 0/014AB0C8, desc: VISIBLE cutoff xid 709 flags 0x01,
blkref #0: rel 1663/12757/16384 fork vm blk 0, blkref #1: rel
1663/12757/16384 blk 2
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
0/014AB148, prev 0/014AB108, desc: VISIBLE cutoff xid 709 flags 0x01,
blkref #0: rel 1663/12757/16384 fork vm blk 0, blkref #1: rel
1663/12757/16384 blk 3
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
0/014AB188, prev 0/014AB148, desc: VISIBLE cutoff xid 709 flags 0x01,
blkref #0: rel 1663/12757/16384 fork vm blk 0, blkref #1: rel
1663/12757/16384 blk 4

[2]
postgres=# select count(*) from pg_get_wal_records_info('0/13C0A98',
'0/0157A160') where block_ref like '%16384%' and rmgr like 'Heap';
count
-------
10100
(1 row)

postgres=# select count(*) from t1;
count
-------
10100
(1 row)

postgres=#

postgres=# select count(*) from pg_get_wal_records_info('0/13C0A98',
'0/0157A160') where block_ref like '%FPW%';
count
-------
78
(1 row)

postgres=#

Regards,
Bharath Rupireddy.

Attachment Content-Type Size
v4-0001-pg_walinspect.patch application/octet-stream 34.7 KB
v4-0002-pg_walinspect-tests.patch application/octet-stream 6.5 KB
v4-0003-pg_walinspect-docs.patch application/octet-stream 5.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-02-10 02:34:41 Re: [RFC] building postgres with meson - perl embedding
Previous Message Andres Freund 2022-02-10 02:22:05 Logging in LockBufferForCleanup()