Re: Emit fewer vacuum records by reaping removable tuples during pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date: 2024-01-24 19:59:44
Message-ID: CA+Tgmobc2FcBF0J-ASNYWmgorsxw8-HzBXHyRyusNhCgvR3RCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 18, 2024 at 9:23 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> I have attached a rebased version of the former 0004 as v11-0001.

This looks correct to me, although I wouldn't mind some more eyes on
it. However, I think that the comments still need more work.

Specifically:

/*
* Prune, freeze, and count tuples.
*
* Accumulates details of remaining LP_DEAD line pointers on page in
* dead_items array. This includes LP_DEAD line pointers that we
* pruned ourselves, as well as existing LP_DEAD line pointers that
* were pruned some time earlier. Also considers freezing XIDs in the
* tuple headers of remaining items with storage. It also determines
* if truncating this block is safe.
*/
- lazy_scan_prune(vacrel, buf, blkno, page,
- vmbuffer, all_visible_according_to_vm,
- &has_lpdead_items);
+ if (got_cleanup_lock)
+ lazy_scan_prune(vacrel, buf, blkno, page,
+ vmbuffer, all_visible_according_to_vm,
+ &has_lpdead_items);

I think this comment needs adjusting. Possibly, the preceding calls to
lazy_scan_noprune() and lazy_scan_new_or_empty() could even use a bit
better comments, but in those cases, you're basically keeping the same
code with the same comment, so it's kinda defensible. Here, though,
you're making the call conditional without any comment update.

/*
* Final steps for block: drop cleanup lock, record free space in the
* FSM.
*
* If we will likely do index vacuuming, wait until
* lazy_vacuum_heap_rel() to save free space. This doesn't just save
* us some cycles; it also allows us to record any additional free
* space that lazy_vacuum_heap_page() will make available in cases
* where it's possible to truncate the page's line pointer array.
*
+ * Our goal is to update the freespace map the last time we touch the
+ * page. If the relation has no indexes, or if index vacuuming is
+ * disabled, there will be no second heap pass; if this particular
+ * page has no dead items, the second heap pass will not touch this
+ * page. So, in those cases, update the FSM now.
+ *
* Note: It's not in fact 100% certain that we really will call
* lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index
* vacuuming (and so must skip heap vacuuming). This is deemed okay
* because it only happens in emergencies, or when there is very
* little free space anyway. (Besides, we start recording free space
* in the FSM once index vacuuming has been abandoned.)
*/

I think this comment needs a rewrite, not just sticking the other
comment in the middle of it. There's some duplication between these
two comments, and merging it all together should iron that out.
Personally, I think my comment (which was there before, this commit
only moves it here) is clearer than what's already here about the
intent, but it's lacking some details that are captured in the other
two paragraphs, and we probably don't want to lose those details.

If you'd like, I can try rewriting these comments to my satisfaction
and you can reverse-review the result. Or you can rewrite them and
I'll re-review the result. But I think this needs to be a little less
mechanical. It's not just about shuffling all the comments around so
that all the text ends up somewhere -- we also need to consider the
degree to which the meaning becomes duplicated when it all gets merged
together.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-01-24 20:05:12 Re: s_lock_test no longer works
Previous Message Kirk Wolak 2024-01-24 19:50:52 Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [ NOT Fixed ]