Re: Emit fewer vacuum records by reaping removable tuples during pruning

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date: 2024-01-04 20:03:31
Message-ID: 20240104200331.q45v4m63pisc2sqs@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-11-17 18:12:08 -0500, Melanie Plageman wrote:
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 14de8158d49..b578c32eeb6 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -8803,8 +8803,13 @@ heap_xlog_prune(XLogReaderState *record)
> nunused = (end - nowunused);
> Assert(nunused >= 0);
>
> - /* Update all line pointers per the record, and repair fragmentation */
> - heap_page_prune_execute(buffer,
> + /*
> + * Update all line pointers per the record, and repair fragmentation.
> + * We always pass pronto_reap as true, because we don't know whether
> + * or not this option was used when pruning. This reduces the
> + * validation done on replay in an assert build.
> + */

Hm, that seems not great. Both because we loose validation and because it
seems to invite problems down the line, due to pronto_reap falsely being set
to true in heap_page_prune_execute().

> @@ -581,7 +589,17 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
> * function.)
> */
> if (ItemIdIsDead(lp))
> + {
> + /*
> + * If the relation has no indexes, we can set dead line pointers
> + * LP_UNUSED now. We don't increment ndeleted here since the LP
> + * was already marked dead.
> + */
> + if (prstate->pronto_reap)
> + heap_prune_record_unused(prstate, offnum);
> +
> break;
> + }

I wasn't immediately sure whether this is reachable - but it is, e.g. after
on-access pruning (which currently doesn't yet use pronto reaping), after
pg_upgrade or dropping an index.

> Assert(ItemIdIsNormal(lp));
> htup = (HeapTupleHeader) PageGetItem(dp, lp);
> @@ -715,7 +733,17 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
> * redirect the root to the correct chain member.
> */
> if (i >= nchain)
> - heap_prune_record_dead(prstate, rootoffnum);
> + {
> + /*
> + * If the relation has no indexes, we can remove dead tuples
> + * during pruning instead of marking their line pointers dead. Set
> + * this tuple's line pointer LP_UNUSED.
> + */
> + if (prstate->pronto_reap)
> + heap_prune_record_unused(prstate, rootoffnum);
> + else
> + heap_prune_record_dead(prstate, rootoffnum);
> + }
> else
> heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);
> }
> @@ -726,9 +754,12 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
> * item. This can happen if the loop in heap_page_prune caused us to
> * visit the dead successor of a redirect item before visiting the
> * redirect item. We can clean up by setting the redirect item to
> - * DEAD state.
> + * DEAD state. If pronto_reap is true, we can set it LP_UNUSED now.
> */
> - heap_prune_record_dead(prstate, rootoffnum);
> + if (prstate->pronto_reap)
> + heap_prune_record_unused(prstate, rootoffnum);
> + else
> + heap_prune_record_dead(prstate, rootoffnum);
> }
>
> return ndeleted;

There's three new calls to heap_prune_record_unused() and the logic got more
nested. Is there a way to get to a nicer end result?

> From 608658f2cbc0acde55aac815c0fdb523ec24c452 Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> Date: Mon, 13 Nov 2023 16:47:08 -0500
> Subject: [PATCH v2 1/2] Indicate rel truncation unsafe in lazy_scan[no]prune
>
> Both lazy_scan_prune() and lazy_scan_noprune() must determine whether or
> not there are tuples on the page making rel truncation unsafe.
> LVRelState->nonempty_pages is updated to reflect this. Previously, both
> functions set an output parameter or output parameter member, hastup, to
> indicate that nonempty_pages should be updated to reflect the latest
> non-removable page. There doesn't seem to be any reason to wait until
> lazy_scan_[no]prune() returns to update nonempty_pages. Plenty of other
> counters in the LVRelState are updated in lazy_scan_[no]prune().
> This allows us to get rid of the output parameter hastup.

> @@ -972,20 +970,21 @@ lazy_scan_heap(LVRelState *vacrel)
> continue;
> }
>
> - /* Collect LP_DEAD items in dead_items array, count tuples */
> - if (lazy_scan_noprune(vacrel, buf, blkno, page, &hastup,
> + /*
> + * Collect LP_DEAD items in dead_items array, count tuples,
> + * determine if rel truncation is safe
> + */
> + if (lazy_scan_noprune(vacrel, buf, blkno, page,
> &recordfreespace))
> {
> Size freespace = 0;
>
> /*
> * Processed page successfully (without cleanup lock) -- just
> - * need to perform rel truncation and FSM steps, much like the
> - * lazy_scan_prune case. Don't bother trying to match its
> - * visibility map setting steps, though.
> + * need to update the FSM, much like the lazy_scan_prune case.
> + * Don't bother trying to match its visibility map setting
> + * steps, though.
> */
> - if (hastup)
> - vacrel->nonempty_pages = blkno + 1;
> if (recordfreespace)
> freespace = PageGetHeapFreeSpace(page);
> UnlockReleaseBuffer(buf);

The comment continues to say that we "determine if rel truncation is safe" -
but I don't see that? Oh, I see, it's done inside lazy_scan_noprune(). This
doesn't seem like a clear improvement to me. Particularly because it's only
set if lazy_scan_noprune() actually does something.

I don't like the existing code in lazy_scan_heap(). But this kinda seems like
tinkering around the edges, without getting to the heart of the issue. I think
we should

1) Move everything after ReadBufferExtended() and the end of the loop into its
own function

2) All the code in the loop body after the call to lazy_scan_prune() is
specific to the lazy_scan_prune() path, it doesn't make sense that it's at
the same level as the the calls to lazy_scan_noprune(),
lazy_scan_new_or_empty() or lazy_scan_prune(). Either it should be in
lazy_scan_prune() or a new wrapper function.

3) It's imo wrong that we have UnlockReleaseBuffer() (there are 6 different
places unlocking if I didn't miscount!) and RecordPageWithFreeSpace() calls
in this many places. I think this is largely a consequence of the previous
points. Once those are addressed, we can have one common place.

But I digress.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-01-04 20:23:09 Re: Confine vacuum skip logic to lazy_scan_skip
Previous Message Andres Freund 2024-01-04 19:24:20 Re: Emit fewer vacuum records by reaping removable tuples during pruning