Re: Confine vacuum skip logic to lazy_scan_skip

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-01-26 13:28:05
Message-ID: CALDaNm3y-kxuJzgqHrJUVO-B=JPE3tXJazoowEkKL745aZH6Zw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 12 Jan 2024 at 05:12, Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> v3 attached
>
> On Thu, Jan 4, 2024 at 3:23 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > Hi,
> >
> > On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:
> > > Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
> > > Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
> > > nskippable_blocks
> >
> > I think these may lead to worse code - the compiler has to reload
> > vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
> > can't guarantee that they're not changed within one of the external functions
> > called in the loop body.
>
> I buy that for 0001 but 0002 is still using local variables.
> nskippable_blocks was just another variable to keep track of even
> though we could already get that info from local variables
> next_unskippable_block and next_block.
>
> In light of this comment, I've refactored 0003/0004 (0002 and 0003 in
> this version [v3]) to use local variables in the loop as well. I had
> started using the members of the VacSkipState which I introduced.
>
> > > Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state
> > >
> > > Future commits will remove all skipping logic from lazy_scan_heap() and
> > > confine it to lazy_scan_skip(). To make those commits more clear, first
> > > introduce the struct, VacSkipState, which will maintain the variables
> > > needed to skip ranges less than SKIP_PAGES_THRESHOLD.
> >
> > Why not add this to LVRelState, possibly as a struct embedded within it?
>
> Done in attached.
>
> > > From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
> > > From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> > > Date: Sat, 30 Dec 2023 16:59:27 -0500
> > > Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip
> >
> > > By always calling lazy_scan_skip() -- instead of only when we have reached the
> > > next unskippable block, we no longer need the skipping_current_range variable.
> > > lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
> > > reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
> > > can derive the visibility status of a block from whether or not we are in a
> > > skippable range -- that is, whether or not the next_block is equal to the next
> > > unskippable block.
> >
> > I wonder if it should be renamed as part of this - the name is somewhat
> > confusing now (and perhaps before)? lazy_scan_get_next_block() or such?
>
> Why stop there! I've removed lazy and called it
> heap_vac_scan_get_next_block() -- a little long, but...
>
> > > + while (true)
> > > {
> > > Buffer buf;
> > > Page page;
> > > - bool all_visible_according_to_vm;
> > > LVPagePruneState prunestate;
> > >
> > > - if (blkno == vacskip.next_unskippable_block)
> > > - {
> > > - /*
> > > - * Can't skip this page safely. Must scan the page. But
> > > - * determine the next skippable range after the page first.
> > > - */
> > > - all_visible_according_to_vm = vacskip.next_unskippable_allvis;
> > > - lazy_scan_skip(vacrel, &vacskip, blkno + 1);
> > > -
> > > - Assert(vacskip.next_unskippable_block >= blkno + 1);
> > > - }
> > > - else
> > > - {
> > > - /* Last page always scanned (may need to set nonempty_pages) */
> > > - Assert(blkno < rel_pages - 1);
> > > -
> > > - if (vacskip.skipping_current_range)
> > > - continue;
> > > + blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
> > > + &all_visible_according_to_vm);
> > >
> > > - /* Current range is too small to skip -- just scan the page */
> > > - all_visible_according_to_vm = true;
> > > - }
> > > + if (blkno == InvalidBlockNumber)
> > > + break;
> > >
> > > vacrel->scanned_pages++;
> > >
> >
> > I don't like that we still do determination about the next block outside of
> > lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
> > and lazy_scan_heap().
> >
> > I'd probably change the interface to something like
> >
> > while (lazy_scan_get_next_block(vacrel, &blkno))
> > {
> > ...
> > }
>
> I've done this. I do now find the parameter names a bit confusing.
> There is next_block (which is the "next block in line" and is an input
> parameter) and blkno, which is an output parameter with the next block
> that should actually be processed. Maybe it's okay?
>
> > > From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
> > > From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> > > Date: Sun, 31 Dec 2023 09:47:18 -0500
> > > Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState
> > >
> > > The streaming read interface can only give pgsr_next callbacks access to
> > > two pieces of private data. As such, move a reference to the LVRelState
> > > into the VacSkipState.
> > >
> > > This is a separate commit (as opposed to as part of the commit
> > > introducing VacSkipState) because it is required for using the streaming
> > > read interface but not a natural change on its own. VacSkipState is per
> > > block and the LVRelState is referenced for the whole relation vacuum.
> >
> > I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
> > or point to it from VacSkipState.
> >
> > LVRelState is already tied to the iteration state, so I don't think there's a
> > reason not to do so.
>
> Done, and, as such, this patch is dropped from the set.

CFBot shows that the patch does not apply anymore as in [1]:
=== applying patch
./v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
patching file src/backend/access/heap/vacuumlazy.c
...
Hunk #10 FAILED at 1042.
Hunk #11 FAILED at 1121.
Hunk #12 FAILED at 1132.
Hunk #13 FAILED at 1161.
Hunk #14 FAILED at 1172.
Hunk #15 FAILED at 1194.
...
6 out of 21 hunks FAILED -- saving rejects to file
src/backend/access/heap/vacuumlazy.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4755.log

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-01-26 13:42:17 Re: tablecmds.c/MergeAttributes() cleanup
Previous Message vignesh C 2024-01-26 13:23:43 Re: [PATCH] Compression dictionaries for JSONB