Re: Combine Prune and Freeze records emitted by vacuum

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-04-01 17:37:46
Message-ID: a065b0a9-dabf-4f99-8ba4-9271995b7c3e@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/04/2024 19:08, Melanie Plageman wrote:
> On Mon, Apr 01, 2024 at 05:17:51PM +0300, Heikki Linnakangas wrote:
>> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
>> @@ -256,15 +270,16 @@ heap_page_prune(Relation relation, Buffer buffer,
>> tup.t_tableOid = RelationGetRelid(relation);
>>
>> /*
>> - * Determine HTSV for all tuples.
>> + * Determine HTSV for all tuples, and queue them up for processing as HOT
>> + * chain roots or as a heap-only items.
>
> Reading this comment now as a whole, I would add something like
> "Determining HTSV for all tuples once is required for correctness" to
> the start of the second paragraph. The new conjunction on the first
> paragraph sentence followed by the next paragraph is a bit confusing
> because it sounds like queuing them up for processing is required for
> correctness (which, I suppose it is on some level). Basically, I'm just
> saying that it is now less clear what is required for correctness.

Fixed.

> While I recognize this is a matter of style and not important, I
> personally prefer this for reverse looping:
>
> for (int i = prstate.nheaponly_items; i --> 0;)

I don't think we use that style anywhere in the Postgres source tree
currently. (And I don't like it ;-) )

> I do think a comment about the reverse order would be nice. I know it
> says something above the first loop to this effect:
>
> * Processing the items in reverse order (and thus the tuples in
> * increasing order) increases prefetching efficiency significantly /
> * decreases the number of cache misses.
>
> So perhaps we could just say "as above, process the items in reverse
> order"

I'm actually not sure why it makes a difference. I would assume all the
data to already be in CPU cache at this point, since the first loop
already accessed it, so I think there's something else going on. But I
didn't investigate it deeper. Anyway, added a comment.

Committed the first of the remaining patches with those changes. And
also this, which is worth calling out:

> if (prstate.htsv[offnum] == HEAPTUPLE_DEAD)
> {
> ItemId itemid = PageGetItemId(page, offnum);
> HeapTupleHeader htup = (HeapTupleHeader) PageGetItem(page, itemid);
>
> if (likely(!HeapTupleHeaderIsHotUpdated(htup)))
> {
> HeapTupleHeaderAdvanceConflictHorizon(htup,
> &prstate.latest_xid_removed);
> heap_prune_record_unused(&prstate, offnum, true);
> }
> else
> {
> /*
> * This tuple should've been processed and removed as part of
> * a HOT chain, so something's wrong. To preserve evidence,
> * we don't dare to remove it. We cannot leave behind a DEAD
> * tuple either, because that will cause VACUUM to error out.
> * Throwing an error with a distinct error message seems like
> * the least bad option.
> */
> elog(ERROR, "dead heap-only tuple (%u, %d) is not linked to from any HOT chain",
> blockno, offnum);
> }
> }
> else
> heap_prune_record_unchanged_lp_normal(page, &prstate, offnum);

As you can see, I turned that into a hard error. Previously, that code
was at the top of heap_prune_chain(), and it was normal to see DEAD
heap-only tuples there, because they would presumably get processed
later as part of a HOT chain. But now this is done after all the HOT
chains have already been processed.

Previously if there was a dead heap-only tuple like that on the page for
some reason, it was silently not processed by heap_prune_chain()
(because it was assumed that it would be processed later as part of a
HOT chain), and was left behind as a HEAPTUPLE_DEAD tuple. If the
pruning was done as part of VACUUM, VACUUM would fail with "ERROR:
unexpected HeapTupleSatisfiesVacuum result". Or am I missing something?

Now you get that above error also on on-access pruning, which is not
ideal. But I don't remember hearing about corruption like that, and
you'd get the error on VACUUM anyway.

With the next patches, heap_prune_record_unchanged() will do more, and
will also throw an error on a HEAPTUPLE_LIVE tuple, so even though in
the first patch we could print just a WARNING and move on, it gets more
awkward with the rest of the patches.

(Continuing with the remaining patches..)

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-04-01 17:39:10 Re: Statistics Import and Export
Previous Message Bruce Momjian 2024-04-01 17:33:28 Re: Statistics Import and Export