Re: Add pg_walinspect function with block info columns

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add pg_walinspect function with block info columns
Date: 2023-03-18 00:51:05
Message-ID: ZBUK+V5TWPB2hKZz@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 17, 2023 at 04:36:58PM -0700, Peter Geoghegan wrote:
> I'm sure that they will do that much more than they would have
> otherwise. Since we'll have made pg_get_wal_block_info() so much more
> useful than pg_get_wal_records_info() for many important use cases.
> Why is that a bad thing? Are you concerned about the overhead of
> pulling in FPIs when pg_get_wal_block_info() is run, if Bharath's
> patch is committed? That could be a problem, I suppose -- but it would
> be good to get more data on that. Do you think that this will be much
> of an issue, Bharath?

Yes. The CPU cost is one thing, but I am also worrying about the
I/O cost with a tuplestore spilling to disk a large number of FPIs,
and some workloads can generate WAL so as FPIs is what makes for most
of the contents stored in the WAL. (wal_compression is very effective
in such cases, for example.)

It is true that it is possible to tweak SQLs that exactly do that with
a large amount of data materialized, or just eat so much CPU that they
basically DoS the backend. Still I'd rather keep a minimalistic
design for each function with block_info having only one field able to
track back to which record a block information refers to, and I'd like
to think one able to look at WAL internals will be smart enough to
write SQL in such a way that they avoid that on a production machine.
The current design allows to do that in this view, but that's just one
way I see how to represent structures at SQL level. Extending
block_info() with more record-level attributes allows that as well,
still it bloats its interface unnecessarily. Designing software is
hard, and it looks like our point of view on that is different. If
you wish to change the current interface of block_info, feel free to
do so. It does not mean that it cannot be changed, just that I
recommend not to do that, and that's just one opinion.

This said, your point about having rec_blk_ref reported as an empty
string rather than NULL if there are no block references does not feel
natural to me, either.. Reporting NULL would be better.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-03-18 00:58:43 Re: meson issue? ninja clean doesn't drop queryjumblefuncs.funcs.c
Previous Message Andres Freund 2023-03-17 23:54:27 Re: gcc 13 warnings