Re: Confine vacuum skip logic to lazy_scan_skip

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-01-11 23:41:52
Message-ID: CAAKRu_b9kbXeK05_jTqBayhsRNeruyWHD6i8RPWoVtGxSV4D9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

v3 attached

On Thu, Jan 4, 2024 at 3:23 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:
> > Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
> > Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
> > nskippable_blocks
>
> I think these may lead to worse code - the compiler has to reload
> vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
> can't guarantee that they're not changed within one of the external functions
> called in the loop body.

I buy that for 0001 but 0002 is still using local variables.
nskippable_blocks was just another variable to keep track of even
though we could already get that info from local variables
next_unskippable_block and next_block.

In light of this comment, I've refactored 0003/0004 (0002 and 0003 in
this version [v3]) to use local variables in the loop as well. I had
started using the members of the VacSkipState which I introduced.

> > Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state
> >
> > Future commits will remove all skipping logic from lazy_scan_heap() and
> > confine it to lazy_scan_skip(). To make those commits more clear, first
> > introduce the struct, VacSkipState, which will maintain the variables
> > needed to skip ranges less than SKIP_PAGES_THRESHOLD.
>
> Why not add this to LVRelState, possibly as a struct embedded within it?

Done in attached.

> > From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> > Date: Sat, 30 Dec 2023 16:59:27 -0500
> > Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip
>
> > By always calling lazy_scan_skip() -- instead of only when we have reached the
> > next unskippable block, we no longer need the skipping_current_range variable.
> > lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
> > reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
> > can derive the visibility status of a block from whether or not we are in a
> > skippable range -- that is, whether or not the next_block is equal to the next
> > unskippable block.
>
> I wonder if it should be renamed as part of this - the name is somewhat
> confusing now (and perhaps before)? lazy_scan_get_next_block() or such?

Why stop there! I've removed lazy and called it
heap_vac_scan_get_next_block() -- a little long, but...

> > + while (true)
> > {
> > Buffer buf;
> > Page page;
> > - bool all_visible_according_to_vm;
> > LVPagePruneState prunestate;
> >
> > - if (blkno == vacskip.next_unskippable_block)
> > - {
> > - /*
> > - * Can't skip this page safely. Must scan the page. But
> > - * determine the next skippable range after the page first.
> > - */
> > - all_visible_according_to_vm = vacskip.next_unskippable_allvis;
> > - lazy_scan_skip(vacrel, &vacskip, blkno + 1);
> > -
> > - Assert(vacskip.next_unskippable_block >= blkno + 1);
> > - }
> > - else
> > - {
> > - /* Last page always scanned (may need to set nonempty_pages) */
> > - Assert(blkno < rel_pages - 1);
> > -
> > - if (vacskip.skipping_current_range)
> > - continue;
> > + blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
> > + &all_visible_according_to_vm);
> >
> > - /* Current range is too small to skip -- just scan the page */
> > - all_visible_according_to_vm = true;
> > - }
> > + if (blkno == InvalidBlockNumber)
> > + break;
> >
> > vacrel->scanned_pages++;
> >
>
> I don't like that we still do determination about the next block outside of
> lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
> and lazy_scan_heap().
>
> I'd probably change the interface to something like
>
> while (lazy_scan_get_next_block(vacrel, &blkno))
> {
> ...
> }

I've done this. I do now find the parameter names a bit confusing.
There is next_block (which is the "next block in line" and is an input
parameter) and blkno, which is an output parameter with the next block
that should actually be processed. Maybe it's okay?

> > From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> > Date: Sun, 31 Dec 2023 09:47:18 -0500
> > Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState
> >
> > The streaming read interface can only give pgsr_next callbacks access to
> > two pieces of private data. As such, move a reference to the LVRelState
> > into the VacSkipState.
> >
> > This is a separate commit (as opposed to as part of the commit
> > introducing VacSkipState) because it is required for using the streaming
> > read interface but not a natural change on its own. VacSkipState is per
> > block and the LVRelState is referenced for the whole relation vacuum.
>
> I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
> or point to it from VacSkipState.
>
> LVRelState is already tied to the iteration state, so I don't think there's a
> reason not to do so.

Done, and, as such, this patch is dropped from the set.

- Melane

Attachment Content-Type Size
v3-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch text/x-patch 1.0 KB
v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch text/x-patch 13.6 KB
v3-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch text/x-patch 15.1 KB
v3-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch text/x-patch 1.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2024-01-11 23:50:50 Re: Confine vacuum skip logic to lazy_scan_skip
Previous Message Michael Paquier 2024-01-11 23:37:49 Re: A recent message added to pg_upgade