Re: Confine vacuum skip logic to lazy_scan_skip

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-03-11 09:29:44
Message-ID: 27d51713-b1dc-49d1-9b3a-2650b7fb9613@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/03/2024 19:34, Melanie Plageman wrote:
> On Fri, Mar 08, 2024 at 06:07:33PM +0200, Heikki Linnakangas wrote:
>> On 08/03/2024 02:46, Melanie Plageman wrote:
>>> On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:
>>>> On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:
>>> However, by adding a vmbuffer to next_block_state, the callback may be
>>> able to avoid extra VM fetches from one invocation to the next.
>>
>> That's a good idea, holding separate VM buffer pins for the next-unskippable
>> block and the block we're processing. I adopted that approach.
>
> Cool. It can't be avoided with streaming read vacuum, but I wonder if
> there would ever be adverse effects to doing it on master? Maybe if we
> are doing a lot of skipping and the block of the VM for the heap blocks
> we are processing ends up changing each time but we would have had the
> right block of the VM if we used the one from
> heap_vac_scan_next_block()?
>
> Frankly, I'm in favor of just doing it now because it makes
> lazy_scan_heap() less confusing.

+1

>> I'm pretty happy with the attached patches now. The first one fixes the
>> existing bug I mentioned in the other email (based on the on-going
>> discussion that might not how we want to fix it though).
>
> ISTM we should still do the fix you mentioned -- seems like it has more
> upsides than downsides?
>
>> From b68cb29c547de3c4acd10f31aad47b453d154666 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Fri, 8 Mar 2024 16:00:22 +0200
>> Subject: [PATCH v8 1/3] Set all_visible_according_to_vm correctly with
>> DISABLE_PAGE_SKIPPING
>>
>> It's important for 'all_visible_according_to_vm' to correctly reflect
>> whether the VM bit is set or not, even when we are not trusting the VM
>> to skip pages, because contrary to what the comment said,
>> lazy_scan_prune() relies on it.
>>
>> If it's incorrectly set to 'false', when the VM bit is in fact set,
>> lazy_scan_prune() will try to set the VM bit again and dirty the page
>> unnecessarily. As a result, if you used DISABLE_PAGE_SKIPPING, all
>> heap pages were dirtied, even if there were no changes. We would also
>> fail to clear any VM bits that were set incorrectly.
>>
>> This was broken in commit 980ae17310, so backpatch to v16.
>
> LGTM.

Committed and backpatched this.

>> From 47af1ca65cf55ca876869b43bff47f9d43f0750e Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Fri, 8 Mar 2024 17:32:19 +0200
>> Subject: [PATCH v8 2/3] Confine vacuum skip logic to lazy_scan_skip()
>> ---
>> src/backend/access/heap/vacuumlazy.c | 256 +++++++++++++++------------
>> 1 file changed, 141 insertions(+), 115 deletions(-)
>>
>> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
>> index ac55ebd2ae5..0aa08762015 100644
>> --- a/src/backend/access/heap/vacuumlazy.c
>> +++ b/src/backend/access/heap/vacuumlazy.c
>> @@ -204,6 +204,12 @@ typedef struct LVRelState
>> int64 live_tuples; /* # live tuples remaining */
>> int64 recently_dead_tuples; /* # dead, but not yet removable */
>> int64 missed_dead_tuples; /* # removable, but not removed */
>
> Perhaps we should add a comment to the blkno member of LVRelState
> indicating that it is used for error reporting and logging?

Well, it's already under the "/* Error reporting state */" section. I
agree this is a little confusing, the name 'blkno' doesn't convey that
it's supposed to be used just for error reporting. But it's a
pre-existing issue so I left it alone. It can be changed with a separate
patch if we come up with a good idea.

> I see why you removed my treatise-level comment that was here about
> unskipped skippable blocks. However, when I was trying to understand
> this code, I did wish there was some comment that explained to me why we
> needed all of the variables next_unskippable_block,
> next_unskippable_allvis, all_visible_according_to_vm, and current_block.
>
> The idea that we would choose not to skip a skippable block because of
> kernel readahead makes sense. The part that I had trouble wrapping my
> head around was that we want to also keep the visibility status of both
> the beginning and ending blocks of the skippable range and then use
> those to infer the visibility status of the intervening blocks without
> another VM lookup if we decide not to skip them.

Right, I removed the comment because looked a little out of place and it
duplicated the other comments sprinkled in the function. I agree this
could still use some more comments though.

Here's yet another attempt at making this more readable. I moved the
logic to find the next unskippable block to a separate function, and
added comments to make the states more explicit. What do you think?

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v9-0001-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch text/x-patch 13.9 KB
v9-0002-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch text/x-patch 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-03-11 09:48:25 Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
Previous Message Peter Eisentraut 2024-03-11 09:29:38 Re: automating RangeTblEntry node support