From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie> |
Subject: | Re: Combine Prune and Freeze records emitted by vacuum |
Date: | 2024-03-29 16:00:28 |
Message-ID: | 4d1bd0d9-ee1d-4a52-9723-ca35ed40c1af@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 29/03/2024 07:04, Melanie Plageman wrote:
> On Thu, Mar 28, 2024 at 11:07:10AM -0400, Melanie Plageman wrote:
>> These comments could use another pass. I had added some extra
>> (probably redundant) content when I thought I was refactoring it a
>> certain way and then changed my mind.
>>
>> Attached is a diff with some ideas I had for a bit of code simplification.
>>
>> Are you working on cleaning this patch up or should I pick it up?
>
> Attached v9 is rebased over master. But, more importantly, I took
> another pass at heap_prune_chain() and am pretty happy with what I came
> up with. See 0021. I simplified the traversal logic and then grouped the
> chain processing into three branches at the end. I find it much easier
> to understand what we are doing for different types of HOT chains.
Ah yes, agreed, that's nicer.
The 'survivor' variable is a little confusing, especially here:
if (!survivor)
{
int i;
/*
* If no DEAD tuple was found, and the root is redirected, mark it as
* such.
*/
if ((i = ItemIdIsRedirected(rootlp)))
heap_prune_record_unchanged_lp_redirect(prstate, rootoffnum);
/* the rest of tuples in the chain are normal, unchanged tuples */
for (; i < nchain; i++)
heap_prune_record_unchanged(dp, prstate, chainitems[i]);
}
You would think that "if(!survivor)", it means that there is no live
tuples on the chain, i.e. no survivors. But in fact it's the opposite;
it means that the are all live. Maybe call it 'ndeadchain' instead,
meaning the number of dead items in the chain.
> I got rid of revisited. We can put it back, but I was thinking: we stash
> all HOT tuples and then loop over them later, calling record_unchanged()
> on the ones that aren't marked. But, if we have a lot of HOT tuples, is
> this really that much better than just looping through all the offsets
> and calling record_unchanged() on just the ones that aren't marked?
Well, it requires looping through all the offsets one more time, and
unless you have a lot of HOT tuples, most items would be marked already.
But maybe the overhead is negligible anyway.
> I've done that in my version. While testing this, I found that only
> on-access pruning needed this final loop calling record_unchanged() on
> items not yet marked. I know we can't skip this final loop entirely in
> the ON ACCESS case because it calls record_prunable(), but we could
> consider moving that back out into heap_prune_chain()? Or what do you
> think?
Hmm, why is that different with on-access pruning?
Here's another idea: In the first loop through the offsets, where we
gather the HTSV status of each item, also collect the offsets of all HOT
and non-HOT items to two separate arrays. Call heap_prune_chain() for
all the non-HOT items first, and then process any remaining HOT tuples
that haven't been marked yet.
> I haven't finished updating all the comments, but I am really interested
> to know what you think about heap_prune_chain() now.
Looks much better now, thanks!
--
Heikki Linnakangas
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2024-03-29 16:06:08 | Re: Possibility to disable `ALTER SYSTEM` |
Previous Message | Nathan Bossart | 2024-03-29 15:59:40 | Re: Popcount optimization using AVX512 |