Re: Combine Prune and Freeze records emitted by vacuum

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-03-27 23:04:04
Message-ID: 68f85228-6955-4c2f-8ac8-1857e4b106cf@iki.fi
Lists: pgsql-hackers

On 27/03/2024 20:26, Melanie Plageman wrote:
> On Wed, Mar 27, 2024 at 12:18 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> On 27/03/2024 17:18, Melanie Plageman wrote:
>>> I need some way to modify the control flow or accounting such that I
>>> know which HEAPTUPLE_RECENTLY_DEAD tuples will not be marked LP_DEAD.
>>> And a way to consider freezing and do live tuple accounting for these
>>> and HEAPTUPLE_LIVE tuples exactly once.
>>
>> Just a quick update: I've been massaging this some more today, and I
>> think I'm onto something palatable. I'll send an updated patch later
>> today, but the key is to note that for each item on the page, there is
>> one point where we determine the fate of the item, whether it's pruned
>> or not. That can happen at different points in heap_page_prune().
>> That's also when we set marked[offnum] = true. Whenever that happens, we
>> call one of the heap_page_prune_record_*() subroutines. We already
>> have those subroutines for when a tuple is marked as dead or unused, but
>> let's add similar subroutines for the case that we're leaving the tuple
>> unchanged. If we move all the bookkeeping logic to those subroutines, we
>> can ensure that it gets done exactly once for each tuple, and at that
>> point we know what we are going to do to the tuple, so we can count it
>> correctly. So heap_prune_chain() decides what to do with each tuple, and
>> ensures that each tuple is marked only once, and the subroutines update
>> all the variables, add the item to the correct arrays etc. depending on
>> what we're doing with it.
>
> Yes, this would be ideal.

Well, that took me a lot longer than expected. My approach of "make sure
you call the right heap_prune_record_*() subroutine in all cases" didn't
work out quite as easily as I thought. Because, as you pointed out, it's
difficult to know if a non-DEAD tuple that is part of a HOT chain will
be visited later as part of the chain processing, or needs to be counted
at the top of heap_prune_chain().

The solution I came up with is to add a third phase to pruning. At the
top of heap_prune_chain(), if we see a live heap-only tuple, and we're
not sure if it will be counted later as part of a HOT chain, we stash it
away and revisit it later, after processing all the HOT chains. That's
somewhat similar to your 'counted' array, but not quite.

Attached is that approach, on top of v7. It's a bit messy; I made a
bunch of other changes too and didn't fully separate them out into
separate patches. Sorry about that.

One change with this is that live_tuples and many of the other fields
are now updated again, even if the caller doesn't need them. With the
other refactorings, it was hard to skip them in a way that would save
any cycles.

Some other notable changes are mentioned in the commit message.

> I was doing some experimentation with pageinspect today (trying to
> find that single place where live tuples' fates are decided) and it
> seems like a heap-only tuple that is not HOT-updated will usually be
> the one at the end of the chain, which seems like it would be covered
> by adding a record_live() type function call in the loop of
> heap_prune_chain():
>
> /*
>  * If the tuple is not HOT-updated, then we are at the end of this
>  * HOT-update chain.
>  */
> if (!HeapTupleHeaderIsHotUpdated(htup))
> {
>     heap_prune_record_live_or_recently_dead(dp, prstate,
>                                             offnum, presult);
>     break;
> }
>
> but that doesn't end up producing the same results as
>
> if (HeapTupleHeaderIsHeapOnly(htup) &&
>     !HeapTupleHeaderIsHotUpdated(htup) &&
>     presult->htsv[rootoffnum] == HEAPTUPLE_DEAD)
>     heap_prune_record_live_or_recently_dead(dp, prstate,
>                                             offnum, presult);
>
> at the top of heap_prune_chain().

Yep, this is tricky; I also spent a lot of time trying to find a good
"choke point" where we could say for sure that a live tuple is processed
exactly once, but fumbled just like you.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v8-0001-lazy_scan_prune-tests-tuple-vis-with-GlobalVisTes.patch text/x-patch 2.1 KB
v8-0002-Pass-heap_prune_chain-PruneResult-output-paramete.patch text/x-patch 3.3 KB
v8-0003-Rename-PruneState-snapshotConflictHorizon-to-late.patch text/x-patch 2.5 KB
v8-0004-heap_page_prune-sets-all_visible-and-visibility_c.patch text/x-patch 19.0 KB
v8-0005-Add-reference-to-VacuumCutoffs-in-HeapPageFreeze.patch text/x-patch 4.7 KB
v8-0006-Prepare-freeze-tuples-in-heap_page_prune.patch text/x-patch 11.8 KB
v8-0007-lazy_scan_prune-reorder-freeze-execution-logic.patch text/x-patch 5.9 KB
v8-0008-Execute-freezing-in-heap_page_prune.patch text/x-patch 31.4 KB
v8-0009-Make-opp-freeze-heuristic-compatible-with-prune-f.patch text/x-patch 4.4 KB
v8-0010-Separate-tuple-pre-freeze-checks-and-invoke-earli.patch text/x-patch 7.4 KB
v8-0011-Remove-heap_freeze_execute_prepared.patch text/x-patch 8.3 KB
v8-0012-Merge-prune-and-freeze-records.patch text/x-patch 11.6 KB
v8-0013-Set-hastup-in-heap_page_prune.patch text/x-patch 7.7 KB
v8-0014-Count-tuples-for-vacuum-logging-in-heap_page_prun.patch text/x-patch 15.7 KB
v8-0015-Save-dead-tuple-offsets-during-heap_page_prune.patch text/x-patch 6.8 KB
v8-0016-move-live-tuple-accounting-to-heap_prune_chain.patch text/x-patch 41.7 KB
v8-0017-Move-frozen-array-to-PruneState.patch text/x-patch 5.7 KB
v8-0018-Cosmetic-fixes.patch text/x-patch 1.9 KB
v8-0019-Almost-cosmetic-fixes.patch text/x-patch 1.3 KB
v8-0020-Move-frz_conflict_horizon-to-tighter-scope.patch text/x-patch 3.1 KB
v8-0021-Add-comment-about-a-pre-existing-issue.patch text/x-patch 1.9 KB
v8-0022-WIP.patch text/x-patch 46.9 KB
