Re: Emit fewer vacuum records by reaping removable tuples during pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date: 2024-01-05 13:59:41
Message-ID: CA+Tgmoa8vmYTrTUc1iwbLDut=BnqsDZc3oA335Yasqc=aSM9EA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 4, 2024 at 6:03 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> When a single page is being processed, page pruning happens in
> heap_page_prune(). Freezing, dead items recording, and visibility
> checks happen in lazy_scan_prune(). Visibility map updates and
> freespace map updates happen back in lazy_scan_heap(). Except, if the
> table has no indexes, in which case, lazy_scan_heap() also invokes
> lazy_vacuum_heap_page() to set dead line pointers unused and do
> another separate visibility check and VM update. I maintain that all
> page-level processing should be done in the page-level processing
> functions (like lazy_scan_prune()). And lazy_scan_heap() shouldn't be
> directly responsible for special case page-level processing.

But you can just as easily turn this argument on its head, can't you?
In general, except for HOT tuples, line pointers are marked dead by
pruning and unused by vacuum. Here you want to invert that division
and make pruning do what would normally be vacuum's responsibility.

I mean, that's not to say that your argument is "wrong" ... but what I
just said really is how I think about it, too.
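
To make the two ways of slicing this concrete, here is a toy sketch in C.
It is not PostgreSQL code: the types, the function names, and the
no_indexes flag are all made up for illustration, and it ignores HOT
chains, redirects, freezing, WAL, and the visibility map. It only shows
the usual split (pruning marks dead, a later vacuum pass marks unused)
next to the no-indexes shortcut being discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy line pointer states; illustrative names, not PostgreSQL's ItemId flags. */
typedef enum { TOY_LP_UNUSED, TOY_LP_NORMAL, TOY_LP_DEAD } ToyLpState;

typedef struct {
    ToyLpState state;
    bool       removable;   /* assumed: tuple is dead to every transaction */
} ToyLinePointer;

/*
 * Pruning pass over one page.
 *
 * Normally a removable tuple's line pointer becomes TOY_LP_DEAD, because
 * index entries may still point at it. If the (hypothetical) no_indexes
 * flag is set, nothing can reference the slot, so removable and already
 * dead items are set TOY_LP_UNUSED right away.
 *
 * Returns the number of items left TOY_LP_DEAD for a later vacuum pass.
 */
static size_t toy_prune_page(ToyLinePointer *items, size_t nitems, bool no_indexes)
{
    size_t ndead = 0;

    assert(items != NULL || nitems == 0);

    for (size_t i = 0; i < nitems; i++) {
        if (items[i].state == TOY_LP_NORMAL && items[i].removable)
            items[i].state = TOY_LP_DEAD;

        if (items[i].state == TOY_LP_DEAD) {
            if (no_indexes)
                items[i].state = TOY_LP_UNUSED;   /* the shortcut */
            else
                ndead++;                          /* left for vacuum */
        }
    }
    return ndead;
}

/*
 * Second pass, standing in for what vacuum does once index entries that
 * pointed at dead items are gone: every TOY_LP_DEAD item becomes
 * TOY_LP_UNUSED. Returns the number of items reclaimed.
 */
static size_t toy_vacuum_page(ToyLinePointer *items, size_t nitems)
{
    size_t nreclaimed = 0;

    assert(items != NULL || nitems == 0);

    for (size_t i = 0; i < nitems; i++) {
        if (items[i].state == TOY_LP_DEAD) {
            items[i].state = TOY_LP_UNUSED;
            nreclaimed++;
        }
    }
    return nreclaimed;
}
```

With no_indexes false, toy_prune_page() leaves work behind and
toy_vacuum_page() finishes it; with no_indexes true, the first function
does both jobs and the second has nothing to do. Which of those two
functions "should" own the final step is exactly the question.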

> > Also, I find "pronto_reap" to be a poor choice of name. "pronto" is an
> > informal word that seems to have no advantage over something like
> > "immediate" or "now," and I don't think "reap" has a precise,
> > universally-understood meaning. You could call this "mark_unused_now"
> > or "immediately_mark_unused" or something and it would be far more
> > self-documenting, IMHO.
>
> Yes, I see how pronto is unnecessarily informal. If there are no cases
> other than when the table has no indexes that we would consider
> immediately marking LPs unused, then perhaps it is better to call it
> "no_indexes" (per Andres's suggestion)?

wfm.

--
Robert Haas
EDB: http://www.enterprisedb.com
