Re: Show various offset arrays for heap WAL records

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Show various offset arrays for heap WAL records
Date: 2023-01-27 17:24:07
Message-ID: CAAKRu_ZLLHKbNyKxHs5Ax6+OO0pxHL5Hq__fC7rPFBjFYhJdFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I have taken a stab at doing some of the tasks listed in this email.

I have made the new files rmgr_utils.c/h.

I have come up with a standard format that I like for the output and
used it in all the heap record types.

Examples below:

snapshotConflictHorizon: 2184, nplans: 2, plans [ { xmax: 0, infomask:
2816, infomask2: 2, ntuples: 5, offsets: [ 10, 11, 12, 18, 71 ] }, {
xmax: 0, infomask: 11008, infomask2: 2, ntuples: 2, offsets: [ 72, 73
] } ]

snapshotConflictHorizon: 2199, nredirected: 4, ndead: 0, nunused: 4,
redirected: [ 1->38, 2->39, 3->40, 4->41 ], dead: [], unused: [ 24,
25, 26, 27, 37 ]

I started documenting it in the rmgr_utils.h header file in a comment,
however it may be worth a README?

I haven't polished this description of the format (or added examples,
etc) or used it in the btree-related functions because I assume the
format and helper function API will need more discussion.

This is still a rough draft, as I anticipate changes will be requested.
I would split it into multiple patches, etc. But I am looking for
feedback on the suggested format and the array formatting helper
function API.

Perhaps there should also be example output of the offset arrays in
pgwalinspect docs?

I've changed the array format helper functions that Andres added to be a
single function with an additional layer of indirection so that any
record with an array can use it regardless of type and format of the
individual elements. The signature is based somewhat off of qsort_r()
and allows the user to pass a function with the the desired format of
the elements.

On a semi-unrelated note, I think it might be nice to have a comment in
heapam_xlog.h about what the infobits fields actually are and why they
exist -- e.g. we only need a subset of infomask[2] bits in these
records.
I put a random comment in the code where I think it should go.
I will delete it later, of course.

On Mon, Jan 9, 2023 at 11:00 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Jan 9, 2023 at 1:58 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > The attached patch adds details to XLOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM,
> > XLOG_HEAP2_FREEZE_PAGE.
>
> I'm bound to end up doing the same in index access methods. Might make
> sense for the utility routines to live somewhere more centralized, at
> least when code reuse is likely. Practically every index AM has WAL
> records that include a sorted page offset number array, just like
> these ones. It's a very standard thing, obviously.

I plan to add these if the format and API I suggested seems like the
right direction.

On Tue, Jan 10, 2023 at 2:35 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > > I chose to include infomask[2] for the different freeze plans mainly because
> > > it looks odd to see different plans without a visible reason. But I'm not sure
> > > that's the right choice.
> >
> > I don't think that it is particularly necessary to do so in order for
> > the output to make sense -- pg_waldump is inherently a tool for
> > experts. What it comes down to for me is whether or not this
> > information is sufficiently useful to display, and/or can be (or needs
> > to be) controlled via some kind of verbosity knob.
>
> It seemed useful enough to me, but I likely also stare more at this stuff than
> most. Compared to the list of offsets it's not that much content.
>

Personally, I like having the infomasks for the freeze plans. If we
someday have a more structured input to rmgr_desc, we could then easily
have them in their own column and use functions like
heap_tuple_infomask_flags() on them.

> > How hard would it be to invent a general mechanism to control the verbosity
> > of what we'll show for each WAL record?
>
> Nontrivial, I'm afraid. We don't pass any relevant parameters to rm_desc:
> void (*rm_desc) (StringInfo buf, XLogReaderState *record);
>
> so we'd need to patch all of them. That might be worth doing at some point,
> but I don't want to tackle it right now.

In terms of a more structured format, it seems like it would make the
most sense to pass a JSON or composite datatype structure to rm_desc
instead of that StringInfo.

I would also like to see functions like XLogRecGetBlockRefInfo() pass
something more useful than a stringinfo buffer so that we could easily
extract out the relfilenode in pgwalinspect.

On Mon, Jan 16, 2023 at 10:09 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Wed, Jan 11, 2023 at 3:11 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Wed, Jan 11, 2023 at 3:00 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > What are your thoughts about the place for the helper functions? You're ok
> > > with rmgrdesc_utils.[ch]?
> >
> > Yeah, that seems okay.
>
> BTW, while playing around with this patch today, I noticed that it
> won't display the number of elements in each offset array directly.
> Perhaps it's worth including that, too?

I believe I have addressed this in the attached patch.

- Melanie

Attachment Content-Type Size
v1-0001-Add-rmgr_desc-utilities.patch application/octet-stream 16.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-01-27 17:53:58 Re: New strategies for freezing, advancing relfrozenxid early
Previous Message Justin Pryzby 2023-01-27 17:23:20 Re: Add LZ4 compression in pg_dump