Re: Combine pg_walinspect till_end_of_wal functions with others

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Combine pg_walinspect till_end_of_wal functions with others
Date: 2023-03-13 06:56:41
Message-ID: ZA7JKVPEJ18fA6q1@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 10, 2023 at 04:45:06PM +0530, Bharath Rupireddy wrote:
> Any comments on the attached v5 patch?

I have reviewed the patch, and found it pretty messy. The tests
should have been divided into their own patch, I think. This is
rather straight-forward once the six functions have their checks
grouped together. The result was pretty good, so I have begun by
applying that as 1f282c2. This also includes most of the refinements
you have proposed for the whitespaces in the tests. Note that we were
missing a few spots with the bound checks for the six functions, so
now the coverage should be full.

After that comes the rest of the patch, and I have found a couple of
mistakes.

- pg_get_wal_records_info(start_lsn pg_lsn, end_lsn pg_lsn)
+ pg_get_wal_records_info(start_lsn pg_lsn, end_lsn pg_lsn DEFAULT NULL)
returns setof record
[...]
- pg_get_wal_stats(start_lsn pg_lsn, end_lsn pg_lsn, per_record boolean DEFAULT false)
+ pg_get_wal_stats(start_lsn pg_lsn, end_lsn pg_lsn DEFAULT NULL, per_record boolean DEFAULT false)

This part of the documentation is now incorrect.

+-- Make sure checkpoints don't interfere with the test.
+SELECT 'init' FROM pg_create_physical_replication_slot('regress_pg_walinspect_slot', true, false);

Adding a physical slot is better for stability of course, but the test
also has to drop it or installcheck would cause an existing cluster to
have that still around. The third argument could be true, as well, so
as we'd use a temporary slot.

- If <replaceable>start_lsn</replaceable>
- or <replaceable>end_lsn</replaceable> are not yet available, the
- function will raise an error. For example:
+ If a future <replaceable>end_lsn</replaceable> (i.e. the LSN server
+ doesn't know about) is specified, it returns stats till end of WAL. It
+ will raise an error, if the server doesn't have WAL available at given
+ <replaceable>start_lsn</replaceable> or if the
+ <replaceable>start_lsn</replaceable> is in future or is past the
+ <replaceable>end_lsn</replaceable>. For example, usage of the function is
+ as follows:

Hmm. I would simplify that, and just mention that an error is raised
when the start LSN is available, without caring about the rest (valid
end LSN being past the current insert LSN, and error if start > end,
the second being obvious).

+ <note>
+ <para>
+ Note that <function>pg_get_wal_records_info_till_end_of_wal</function> and
+ <function>pg_get_wal_stats_till_end_of_wal</function> functions have been
+ removed in the <filename>pg_walinspect</filename> version
+ <literal>1.1</literal>. The same functionality can be achieved with
+ <function>pg_get_wal_records_info</function> and
+ <function>pg_get_wal_stats</function> functions by specifying a future
+ <replaceable>end_lsn</replaceable>. However, <function>till_end_of_wal</function>
+ functions will still work if the extension is installed explicitly with
+ version <literal>1.0</literal>.
+ </para>
+ </note>

Not convinced that this is necessary.

+ GetInputLSNs(fcinfo, &start_lsn, &end_lsn, till_end_of_wal);
+
+ stats_per_record = PG_GETARG_BOOL(2);

This code in GetWalStats() is incorrect.
pg_get_wal_stats_till_end_of_wal() has a stats_per_record, but as
*second* argument, so this would be broken.

+ GetInputLSNs(fcinfo, &start_lsn, &end_lsn, till_end_of_wal);

Coming from the last point, I think that this interface is confusing,
and actually incorrect. From what I can see, we should be doing what
~15 has by grepping the argument values within the main function
calls, and just pass them down to the internal routines GetWalStats()
and GetWALRecordsInfo().

-static bool
-IsFutureLSN(XLogRecPtr lsn, XLogRecPtr *curr_lsn)
+static XLogRecPtr
+GetCurrentLSN(void)

This wrapper is actually a good idea.

At the end, I am finishing with the attached. ValidateInputLSNs()
ought to be called, IMO, when the caller of the SQL functions can
directly specify an end_lsn. This means that there is no point to do
this check in the two till_end_* functions. This has as cost two
extra checks to make sure that the start_lsn is not higher than the
current LSN, but I am fine to live with that. It seemed rather
natural to me to let ValidateInputLSNs() do a refresh of the end_lsn
if it sees that it is higher than the current LSN. And if you look
closely, you will see that we only call *once* GetCurrentLSN() for
each function call, so the maths are more precise.

I have cleaned up the comments of the modules, while on it, as there
was not much value in copy-pasting how a function fails while there is
a centralized validation code path. The tests for the till_end()
functions have been moved to the test path where we install 1.0.

With all these cleanups done, there is less code than at the
beginning, which comes from the docs, so the current code does not
change in size:
7 files changed, 173 insertions(+), 206 deletions(-)
--
Michael

Attachment Content-Type Size
v6-0001-Rework-pg_walinspect-functions.patch text/x-diff 24.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-03-13 07:05:55 RE: [Proposal] Add foreign-server health checks infrastructure
Previous Message Peter Eisentraut 2023-03-13 06:39:52 Re: CI and test improvements