Re: Eliminate redundant tuple visibility check in vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Subject: Re: Eliminate redundant tuple visibility check in vacuum
Date: 2023-08-31 22:29:29
Message-ID: CAAKRu_aM43xZztW8=_Ot-zX=pNAm7c-WBULesUWjXr3v9k_2yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 31, 2023 at 2:03 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> I have a few suggestions:
>
> - Rather than removing the rather large comment block at the top of
> lazy_scan_prune() I'd like to see it rewritten to explain how we now
> deal with the problem. I'd suggest leaving the first paragraph ("Prior
> to...") just as it is and replace all the words following "The
> approach we take now is" with a description of the approach that this
> patch takes to the problem.

Good idea. I've updated the comment. I also explain why this new
approach works in the commit message and reference the commit which
added the previous approach.

> - I'm not a huge fan of the caller of heap_page_prune() having to know
> how to initialize the PruneResult. Can heap_page_prune() itself do
> that work, so that presult is an out parameter rather than an in-out
> parameter? Or, if not, can it be moved to a new helper function, like
> heap_page_prune_init(), rather than having that logic in 2+ places?

Ah, yes. Now that it has two callers, and since it is exclusively an
output parameter, it is quite ugly to initialize it in both callers.
Fixed in the attached.

> - int ndeleted,
> - nnewlpdead;
> -
> - ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
> - limited_ts, &nnewlpdead, NULL);
> + int ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
> + limited_ts, &presult, NULL);
>
> - I don't particularly like merging the declaration with the
> assignment unless the call is narrow enough that we don't need a line
> break in there, which is not the case here.

I have changed this.

> - I haven't thoroughly investigated yet, but I wonder if there are any
> other places where comments need updating. As a general note, I find
> it desirable for a function's header comment to mention whether any
> pointer parameters are in parameters, out parameters, or in-out
> parameters, and what the contract between caller and callee actually
> is.

I've investigated vacuumlazy.c and pruneheap.c and looked at the
commit that added the retry loop (8523492d4e349) to see everywhere it
added comments and don't see anywhere else that needs updating.

I have updated lazy_scan_prune()'s function header comment to describe
the nature of the in-out and output parameters and the contract.

- Melanie

Attachment Content-Type Size
v2-0001-Rebrand-LVPagePruneState-as-PruneResult.patch text/x-patch 15.1 KB
v2-0002-Reuse-heap_page_prune-tuple-visibility-statuses.patch text/x-patch 15.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2023-08-31 22:35:19 Re: Eliminate redundant tuple visibility check in vacuum
Previous Message Jim Jones 2023-08-31 22:01:37 Show inline comments from pg_hba lines in the pg_hba_file_rules view