Re: Eliminate redundant tuple visibility check in vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Subject: Re: Eliminate redundant tuple visibility check in vacuum
Date: 2023-09-28 12:46:46
Message-ID: CAAKRu_aUQcRvrt=ka4qCMJVRjGUvLCJjiiu_6iRQAq2mb9LOHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 21, 2023 at 3:53 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Sep 13, 2023 at 1:29 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > > /*
> > > * The criteria for counting a tuple as live in this block need to
> > > @@ -1682,7 +1664,7 @@ retry:
> > > * (Cases where we bypass index vacuuming will violate this optimistic
> > > * assumption, but the overall impact of that should be negligible.)
> > > */
> > > - switch (res)
> > > + switch ((HTSV_Result) presult.htsv[offnum])
> > > {
> > > case HEAPTUPLE_LIVE:
> >
> > I think we should assert that we have a valid HTSV_Result here, i.e. not
> > -1. You could wrap the cast and Assert into an inline funciton as well.
>
> This isn't a bad idea, although I don't find it completely necessary either.

Attached v5 does this. Even though a value of -1 would hit the default
switch case and error out, I can see the argument for this validation
-- as all other places switching on an HTSV_Result are doing so on a
value which was always an HTSV_Result.

Once I started writing the function comment, however, I felt a bit
awkward. In order to make the function available to both pruneheap.c
and vacuumlazy.c, I had to put it in a header file. Writing a
function, available to anyone including heapam.h, which takes an int
and returns an HTSV_Result feels a bit odd. Do we want it to be common
practice to use an int value outside the valid enum range to store
"status not yet computed" for HTSV_Results?

Anyway, on a tactical note, I added the inline function to heapam.h
below the PruneResult definition since it is fairly tightly coupled to
the htsv array in PruneResult. All of the function prototypes are
under a comment that says "function prototypes for heap access method"
-- which didn't feel like an accurate description of this function. I
wonder if it makes sense to have pruneheap.c include vacuum.h and move
pruning specific stuff like this helper and PruneResult over there? I
can't remember why I didn't do this before, but maybe there is a
reason not to? I also wasn't sure if I needed to forward declare the
inline function or not.

Oh, and, one more note. I've dropped the former patch 0001 which
changed the function comment about off_loc above heap_page_prune(). I
have plans to write a separate patch adding an error context callback
for HOT pruning with the offset number and would include such a change
in that patch.

- Melanie

Attachment Content-Type Size
v5-0002-Reuse-heap_page_prune-tuple-visibility-statuses.patch text/x-patch 11.3 KB
v5-0001-Move-heap_page_prune-output-parameters-into-struc.patch text/x-patch 7.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2023-09-28 13:01:42 Re: Synchronizing slots from primary to standby
Previous Message Zhijie Hou (Fujitsu) 2023-09-28 12:38:20 RE: [PoC] pg_upgrade: allow to upgrade publisher node