Re: Combine Prune and Freeze records emitted by vacuum

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-04-01 14:17:51
Message-ID: e7b3116c-79b0-4864-a8d2-5b59b498f724@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30/03/2024 07:57, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:32:21PM -0400, Melanie Plageman wrote:
>> On Fri, Mar 29, 2024 at 12:00 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>> Here's another idea: In the first loop through the offsets, where we
>>> gather the HTSV status of each item, also collect the offsets of all HOT
>>> and non-HOT items to two separate arrays. Call heap_prune_chain() for
>>> all the non-HOT items first, and then process any remaining HOT tuples
>>> that haven't been marked yet.
>>
>> That's an interesting idea. I'll try it out and see how it works.
>
> Attached v10 implements this method of dividing tuples into HOT and
> non-HOT and processing the potential HOT chains first then processing
> tuples not marked by calling heap_prune_chain().
>
> I have applied the refactoring of heap_prune_chain() to master and then
> built the other patches on top of that.

Committed some of the changes. Continuing to reviewing the rest.

> I discovered while writing this that LP_DEAD item offsets must be in
> order in the deadoffsets array (the one that is used to populate
> LVRelState->dead_items).
>
> When I changed heap_page_prune_and_freeze() to partition the offsets
> into HOT and non-HOT during the first loop through the item pointers
> array (where we get tuple visibility information), we add dead item
> offsets as they are encountered. So, they are no longer in order. I've
> added a quicksort of the deadoffsets array to satisfy vacuum.

Good catch.

> I think that we are actually successfully removing more RECENTLY_DEAD
> HOT tuples than in master with heap_page_prune()'s new approach, and I
> think it is correct; but let me know if I am missing something.

Yep. In the case of a two-item chain, RECENTLY_DEAD -> DEAD, the new
code can always remove both items. On 'master', it depends on which item
it happens to process first. If it processes the RECENTLY_DEAD item
first, then it follows the chain and removes both. But if it processes
the DEAD item first, the RECENTLY_DEAD item is left behind. It will be
removed by the next VACUUM, so it's not a correctness issue, and
probably doesn't make any practical performance difference either as
it's a rare corner case, but I feel better that it's more deterministic now.

> The early patches in the set include some additional comment cleanup as
> well. 0001 is fairly polished. 0004 could use some variable renaming
> (this patch partitions the tuples into HOT and not HOT and then
> processes them). I was struggling with some of the names here
> (chainmembers and chaincandidates is confusing).

I didn't understand why you wanted to juggle both partitions in the same
array. So I separated them into two arrays, and called them 'root_items'
and 'heaponly_items'.

In some micro-benchmarks, the order that the items were processed made a
measurable difference. So I'm processing the items in the reverse order.
That roughly matches the order the items are processed in master, as it
iterates the offsets from high-to-low in the first loop, and low-to-high
in the second loop.

> The bulk of the combining of pruning and freezing is lumped into 0010.
>
> I had planned to separate 0010 into 4 separate patches: 1 to execute
> freezing in heap_prune_chain(), 1 for the freeze heuristic approximating
> what is on master, and 1 for emitting a single record containing both
> the pruning and freezing page modifications.
>
> I ended up not doing this because I felt like the grouping of changes in
> 0007-0009 is off. As long as I still execute freezing in
> lazy_scan_prune(), I have to share lots of state between
> lazy_scan_prune() and heap_page_prune(). This meant I added a lot of
> parameters to heap_page_prune() that later commits removed -- making the
> later patches noisy and not so easy to understand.
>
> I'm actually not sure what should go in what commit (either for review
> clarity or for the actual final version).
>
> But, I think we should probably focus on review of the code and not as
> much how it is split up yet.

Yeah, that's fine, 0010 is manageable-sized now.

> The final state of the code could definitely use more cleanup. I've been
> staring at it for awhile, so I could use some thoughts/ideas about what
> part to focus on improving.

Committed some of the changes. I plan to commit at least the first of
these remaining patches later today. I'm happy with it now, but I'll
give it a final glance over after dinner.

I'll continue to review the rest after that, but attached is what I have
now.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v11-0001-Handle-non-chain-tuples-outside-of-heap_prune_ch.patch text/x-patch 15.5 KB
v11-0002-Invoke-heap_prune_record_prunable-during-record-.patch text/x-patch 8.2 KB
v11-0003-Introduce-PRUNE_DO_-actions.patch text/x-patch 8.0 KB
v11-0004-Prepare-freeze-tuples-in-heap_page_prune.patch text/x-patch 17.0 KB
v11-0005-Set-hastup-in-heap_page_prune.patch text/x-patch 5.8 KB
v11-0006-Save-dead-tuple-offsets-during-heap_page_prune.patch text/x-patch 7.3 KB
v11-0007-Combine-freezing-and-pruning.patch text/x-patch 69.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Leung, Anthony 2024-04-01 14:29:29 Re: Allow non-superuser to cancel superuser tasks.
Previous Message David E. Wheeler 2024-04-01 14:15:44 Re: Security lessons from liblzma