Re: Add pg_walinspect function with block info columns

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pg(at)bowt(dot)ie, michael(at)paquier(dot)xyz, melanieplageman(at)gmail(dot)com, boekewurm+postgres(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add pg_walinspect function with block info columns
Date: 2023-03-23 17:24:40
Message-ID: CALj2ACXtpXabi=YhoFPGc3AaWzYNcEua2MsD4yw=TVntbHjT7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 20, 2023 at 8:51 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> + /* Get block references, if any, otherwise continue. */
> + if (!XLogRecHasAnyBlockRefs(xlogreader))
>
> code does". I feel we don't need the a comment there.

Removed.

> This means GetWALBlockInfo overwrites the last two columns generated
> by GetWalRecordInfo, but I don't think this approach is clean and
> stable. I agree we don't want the final columns in a block info tuple
> but we don't want to duplicate the common code path.
>
> I initially thought we could devide the function into
> GetWALCommonInfo(), GetWALRecordInfo() and GetWALBlockInfo(), but it
> doesn't seem that simple.. In the end, I think we should have separate
> GetWALRecordInfo() and GetWALBlockInfo() that have duplicate
> "values[i++] = .." lines.

Done as per Peter's suggestion (keeping primary key columns first and
having a bit of code duplicated instead of making it complex in the
name of deduplication). Please see the attached v4 patch set.

On Tue, Mar 21, 2023 at 5:04 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Mar 19, 2023 at 8:21 PM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > The documentation has just one section titled "General Functions"
> > which directly contains detailed explation of four functions, making
> > it hard to get clear understanding of the available functions. I
> > considered breaking it down into a few subsections, but that wouldn't
> > look great since most of them would only contain one function.
> > However, I feel it would be helpful to add a list of all functions at
> > the beginning of the section.
>
> I like the idea of sections, even if there is only one function per
> section in some cases.

Hm, -1 for now. Most of the extensions that I have seen don't have
anything like that. If needed, someone can start a separate thread for
such a proposal for all of the extensions.

> I also think that we should add a "Tip" that advises users that they
> may use an "end LSN" that is the largest possible LSN,
> 'FFFFFFFF/FFFFFFFF' to get information about records up until the
> current LSN of the cluster (per commit 5c1b6628).
>
> Is there a straightforward way to get a usable LSN constant for this
> purpose? The simplest way I could come up with quickly is "SELECT
> pg_lsn(2^64.-1)" -- which still isn't very simple. Actually, it might
> be even worse than 'FFFFFFFF/FFFFFFFF', so perhaps we should just use
> that in the docs new "Tip".

Done.

> > I agree that adding a note about the characteristics would helpful to
> > avoid the misuse of pg_get_wal_block_info(). How about something like,
> > "Note that pg_get_wal_block_info() omits records that contains no
> > block references."?
>
> This should be a strict invariant. In other words, it should be part
> of the documented contract of pg_get_wal_block_info and
> pg_get_wal_records_info. The two functions should be defined in terms
> of each other. Their relationship is important.
>
> Users should be able to safely assume that the records that have a
> NULL block_ref according to pg_get_wal_records_info are *precisely*
> those records that won't have any entries within pg_get_wal_block_info
> (assuming that the same LSN range is used with both functions).
> pg_walinspect should explicitly promise this, and promise the
> corollary condition around non-NULL block_ref records. It is a useful
> promise from the point of view of users. It also makes it easier to
> understand what's really going on here without any ambiguity.
>
> I don't completely disagree with Michael about the redundancy. I just
> think that it's worth it on performance grounds. We might want to say
> that directly in the docs, too.

Added a note in the docs.

> Also, if GetWALBlockInfo() is now supposed to only be called when
> XLogRecHasAnyBlockRefs() now then it should probably have an assertion
> to verify the precondition.

Done.

> > I initially thought we could devide the function into
> > GetWALCommonInfo(), GetWALRecordInfo() and GetWALBlockInfo(), but it
> > doesn't seem that simple.. In the end, I think we should have separate
> > GetWALRecordInfo() and GetWALBlockInfo() that have duplicate
> > "values[i++] = .." lines.
>
> I agree. A little redundancy is better when the alternative is fragile
> code, and I'm pretty sure that that applies here -- there won't be
> very many duplicated lines, and the final code will be significantly
> clearer. There can be a comment about keeping GetWALRecordInfo and
> GetWALBlockInfo in sync.

Done.

On Tue, Mar 21, 2023 at 5:21 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> The new pg_get_wal_block_info outputs columns in an order that doesn't
> seem like the most useful possible order to me. This gives us another
> reason to have separate GetWALRecordInfo and GetWALBlockInfo utility
> functions rather than sharing logic for building output tuples.
>
> Specifically, I think that pg_get_wal_block_info should ouput the
> "primary key" columns first:
>
> reltablespace, reldatabase, relfilenode, blockid, start_lsn, end_lsn
>
> Next comes the columns that duplicate the columns output by
> pg_get_wal_records_info, in the same order as they appear in
> pg_get_wal_records_info. (Obviously this won't include block_ref).

Done.

On Tue, Mar 21, 2023 at 5:30 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> I think that we should also make the description output column display
> NULLs for those records that don't output any description string. This
> at least includes the "FPI" record type from the "XLOG" rmgr.
> Alternatively, we could find a way of making it show a description.

Done.

Please see the attached v4 patch set addressing all the review comments.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v4-0001-Few-optimizations-in-pg_walinspect.patch application/x-patch 3.8 KB
v4-0002-Emit-WAL-record-info-via-pg_get_wal_block_info.patch application/x-patch 12.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-03-23 17:26:07 Re: HOT chain validation in verify_heapam()
Previous Message Jeff Davis 2023-03-23 17:16:45 Re: ICU locale validation / canonicalization