Re: Combine Prune and Freeze records emitted by vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-03-27 18:26:34
Message-ID: CAAKRu_YFOyMF=p+cV7DzZ=+X_tsREO8vVucEVrV6uMrdSRW8AQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 27, 2024 at 12:18 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 27/03/2024 17:18, Melanie Plageman wrote:
> > I need some way to modify the control flow or accounting such that I
> > know which HEAPTUPLE_RECENTLY_DEAD tuples will not be marked LP_DEAD.
> > And a way to consider freezing and do live tuple accounting for these
> > and HEAPTUPLE_LIVE tuples exactly once.
>
> Just a quick update: I've been massaging this some more today, and I
> think I've got something palatable. I'll send an updated patch later
> today, but the key is to note that for each item on the page, there is
> one point where we determine the fate of the item, whether it's pruned
> or not. That can happen at different points in heap_page_prune().
> That's also when we set marked[offnum] = true. Whenever that happens, we
> call one of the heap_page_prune_record_*() subroutines. We already
> have those subroutines for when a tuple is marked as dead or unused, but
> let's add similar subroutines for the case that we're leaving the tuple
> unchanged. If we move all the bookkeeping logic to those subroutines, we
> can ensure that it gets done exactly once for each tuple, and at that
> point we know what we are going to do to the tuple, so we can count it
> correctly. So heap_prune_chain() decides what to do with each tuple, and
> ensures that each tuple is marked only once, and the subroutines update
> all the variables, add the item to the correct arrays etc. depending on
> what we're doing with it.

Yes, this would be ideal.
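
To make the "exactly once" idea concrete, here is a minimal toy sketch of
that bookkeeping pattern. It is not the patch and not the real PruneState:
ToyPruneState, toy_mark(), toy_record_dead() and toy_record_unchanged() are
hypothetical names invented for illustration, and the counters are only an
assumption about what such subroutines might track. The point is that every
record function goes through one helper that asserts the item has not been
marked yet, so double counting trips an assertion instead of silently
skewing the totals.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ITEMS 16            /* hypothetical page capacity for this toy */

/* Hypothetical stand-in for the prune state; not the real PruneState. */
typedef struct ToyPruneState
{
    bool marked[MAX_ITEMS + 1]; /* has this offset's fate been decided? */
    int  ndead;                 /* items that will become LP_DEAD */
    int  deadoffsets[MAX_ITEMS];
    int  nunchanged;            /* items left as they are */
    int  live_tuples;           /* subset of unchanged items that are live */
} ToyPruneState;

static void
toy_prune_init(ToyPruneState *st)
{
    memset(st, 0, sizeof(*st));
}

/* Single choke point: an item's fate may be decided only once. */
static void
toy_mark(ToyPruneState *st, int offnum)
{
    assert(offnum >= 1 && offnum <= MAX_ITEMS);
    assert(!st->marked[offnum]);
    st->marked[offnum] = true;
}

/* The item will be marked dead; remember its offset. */
static void
toy_record_dead(ToyPruneState *st, int offnum)
{
    toy_mark(st, offnum);
    st->deadoffsets[st->ndead++] = offnum;
}

/* The item is left unchanged; do the live-tuple accounting here. */
static void
toy_record_unchanged(ToyPruneState *st, int offnum, bool is_live)
{
    toy_mark(st, offnum);
    st->nunchanged++;
    if (is_live)
        st->live_tuples++;
}
```

With this shape, the chain-walking code only decides which record function
to call for each offset; all counters and arrays are updated inside the
record functions, and a second call for the same offset fails the assertion
in toy_mark().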

I was doing some experimentation with pageinspect today (trying to
find that single place where live tuples' fates are decided), and it
seems like a heap-only tuple that is not HOT-updated will usually be
the one at the end of the chain. That seems like it would be covered
by adding a record_live()-type function call in the loop of
heap_prune_chain():

    /*
     * If the tuple is not HOT-updated, then we are at the end of this
     * HOT-update chain.
     */
    if (!HeapTupleHeaderIsHotUpdated(htup))
    {
        heap_prune_record_live_or_recently_dead(dp, prstate,
                                                offnum, presult);
        break;
    }

but that doesn't end up producing the same results as

    if (HeapTupleHeaderIsHeapOnly(htup) &&
        !HeapTupleHeaderIsHotUpdated(htup) &&
        presult->htsv[rootoffnum] == HEAPTUPLE_DEAD)
        heap_prune_record_live_or_recently_dead(dp, prstate,
                                                offnum, presult);

at the top of heap_prune_chain().

- Melanie
