Re: Confine vacuum skip logic to lazy_scan_skip

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-03-07 03:00:23
Message-ID: 20240307030023.2gajgexeelqtohnu@liskov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:
> On 27/02/2024 21:47, Melanie Plageman wrote:
> > The attached v5 has some simplifications when compared to v4 but takes
> > largely the same approach.
> >
> > 0001-0004 are refactoring
>
> I'm looking at just these 0001-0004 patches for now. I like those changes a
> lot for the sake of readablity even without any of the later patches.

Thanks! And thanks so much for the review!

I've done a small performance experiment comparing a branch with all of
the patches applied (your v6 0001-0009) with master. I made an 11 GB
table that has 1,394,328 blocks. For setup, I vacuumed it to update the
VM and made sure it was entirely in shared buffers. All of this was to
make sure all of the blocks would be skipped and we spend the majority
of the time spinning through the lazy_scan_heap() code. Then I ran
vacuum again (the actual test). I saw vacuum go from 13 ms to 10 ms
with the patches applied.

I think I need to do some profiling to see if the difference is actually
due to our code changes, but I thought I would share preliminary
results.

> I made some further changes. I kept them as separate commits for easier
> review, see the commit messages for details. Any thoughts on those changes?

I've given some inline feedback on most of the extra patches you added.
Short answer is they all seem fine to me except I have a reservations
about 0008 because of the number of blkno variables flying around. I
didn't have a chance to rebase these into my existing changes today, so
either I will do it tomorrow or, if you are feeling like you're on a
roll and want to do it, that also works!

> I feel heap_vac_scan_get_next_block() function could use some love. Maybe
> just some rewording of the comments, or maybe some other refactoring; not
> sure. But I'm pretty happy with the function signature and how it's called.

I was wondering if we should remove the "get" and just go with
heap_vac_scan_next_block(). I didn't do that originally because I didn't
want to imply that the next block was literally the sequentially next
block, but I think maybe I was overthinking it.

Another idea is to call it heap_scan_vac_next_block() and then the order
of the words is more like the table AM functions that get the next block
(e.g. heapam_scan_bitmap_next_block()). Though maybe we don't want it to
be too similar to those since this isn't a table AM callback.

As for other refactoring and other rewording of comments and such, I
will take a pass at this tomorrow.

> BTW, do we have tests that would fail if we botched up
> heap_vac_scan_get_next_block() so that it would skip pages incorrectly, for
> example? Not asking you to write them for this patch, but I'm just
> wondering.

So, while developing this, when I messed up and skipped blocks I
shouldn't, vacuum would error out with the "found xmin from before
relfrozenxid" error -- which would cause random tests to fail. I know
that's not a correctly failing test of this code. I think there might be
some tests in the verify_heapam tests that could/do test this kind of
thing but I don't remember them failing for me during development -- so
I didn't spend much time looking at them.

I would also sometimes get freespace or VM tests that would fail because
those blocks that are incorrectly skipped were meant to be reflected in
the FSM or VM in those tests.

All of that is to say, perhaps we should write a more targeted test?

When I was writing the code, I added logging of skipped blocks and then
came up with different scenarios and ran them on master and with the
patch and diffed the logs.

> From b4047b941182af0643838fde056c298d5cc3ae32 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 6 Mar 2024 20:13:42 +0200
> Subject: [PATCH v6 5/9] Remove unused 'skipping_current_range' field
>
> ---
> src/backend/access/heap/vacuumlazy.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
> index 65d257aab83..51391870bf3 100644
> --- a/src/backend/access/heap/vacuumlazy.c
> +++ b/src/backend/access/heap/vacuumlazy.c
> @@ -217,8 +217,6 @@ typedef struct LVRelState
> Buffer vmbuffer;
> /* Next unskippable block's visibility status */
> bool next_unskippable_allvis;
> - /* Whether or not skippable blocks should be skipped */
> - bool skipping_current_range;
> } skip;
> } LVRelState;
>
> --
> 2.39.2
>

Oops! I thought I removed this. I must have forgotten

> From 27e431e8dc69bbf09d831cb1cf2903d16f177d74 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 6 Mar 2024 20:58:57 +0200
> Subject: [PATCH v6 6/9] Move vmbuffer back to a local varible in
> lazy_scan_heap()
>
> It felt confusing that we passed around the current block, 'blkno', as
> an argument to lazy_scan_new_or_empty() and lazy_scan_prune(), but
> 'vmbuffer' was accessed directly in the 'scan_state'.
>
> It was also a bit vague, when exactly 'vmbuffer' was valid. Calling
> heap_vac_scan_get_next_block() set it, sometimes, to a buffer that
> might or might not contain the VM bit for 'blkno'. But other
> functions, like lazy_scan_prune(), assumed it to contain the correct
> buffer. That was fixed up visibilitymap_pin(). But clearly it was not
> "owned" by heap_vac_scan_get_next_block(), like the other 'scan_state'
> fields.
>
> I moved it back to a local variable, like it was. Maybe there would be
> even better ways to handle it, but at least this is not worse than
> what we have in master currently.

I'm fine with this. I did it the way I did (grouping it with the
"next_unskippable_block" in the skip struct), because I think that this
vmbuffer is always the buffer containing the VM bit for the next
unskippable block -- which sometimes is the block returned by
heap_vac_scan_get_next_block() and sometimes isn't.

I agree it might be best as a local variable but perhaps we could retain
the comment about it being the block of the VM containing the bit for the
next unskippable block. (Honestly, the whole thing is very confusing).

> From 519e26a01b6e6974f9e0edb94b00756af053f7ee Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 6 Mar 2024 20:27:57 +0200
> Subject: [PATCH v6 7/9] Rename skip_state
>
> I don't want to emphasize the "skipping" part. Rather, it's the state
> onwed by the heap_vac_scan_get_next_block() function

This makes sense to me. Skipping should be private details of vacuum's
get_next_block functionality. Though the name is a bit long. Maybe we
don't need the "get" and "state" parts (it is already in a struct with
state in the name)?

> From 6dfae936a29e2d3479273f8ab47778a596258b16 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 6 Mar 2024 21:03:19 +0200
> Subject: [PATCH v6 8/9] Track 'current_block' in the skip state
>
> The caller was expected to always pass last blk + 1. It's not clear if
> the next_unskippable block accounting would work correctly if you
> passed something else. So rather than expecting the caller to do that,
> have heap_vac_scan_get_next_block() keep track of the last returned
> block itself, in the 'skip' state.
>
> This is largely redundant with the LVRelState->blkno field. But that
> one is currently only used for error reporting, so it feels best to
> give heap_vac_scan_get_next_block() its own field that it owns.

I understand and agree with you that relying on blkno + 1 is bad and we
should make the "next_block" state keep track of the current block.

Though, I now find it easy to confuse
lvrelstate->get_next_block_state->current_block, lvrelstate->blkno and
the local variable blkno in lazy_scan_heap(). I think it is a naming
thing and not that we shouldn't have all three. I'll think more about it
in the morning.

> From 619556cad4aad68d1711c12b962e9002e56d8db2 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
> Date: Wed, 6 Mar 2024 21:35:11 +0200
> Subject: [PATCH v6 9/9] Comment & whitespace cleanup
>
> I moved some of the paragraphs to inside the
> heap_vac_scan_get_next_block() function. I found the explanation in
> the function comment at the old place like too much detail. Someone
> looking at the function signature and how to call it would not care
> about all the details of what can or cannot be skipped.

LGTM.

Thanks again.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2024-03-07 03:07:01 Re: Synchronizing slots from primary to standby
Previous Message jian he 2024-03-07 02:50:09 Re: remaining sql/json patches