Re: New IndexAM API controlling index vacuum strategies

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-04-05 21:44:01
Message-ID: CAEze2Wh-nXjkp0bLN_vQwgHttC8CRH=1ewcrWk+7RX5B93YQPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 4 Apr 2021 at 04:00, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Apr 1, 2021 at 8:25 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I've also found a way to further simplify the table-without-indexes
> > case: make it behave like a regular two-pass/has-indexes VACUUM with
> > regard to visibility map stuff when the page doesn't need a call to
> > lazy_vacuum_heap() (because there are no LP_DEAD items to set
> > LP_UNUSED on the page following pruning). But when it does call
> > lazy_vacuum_heap(), the call takes care of everything for
> > lazy_scan_heap(), which just continues to the next page due to
> > considering prunestate to have been "invalidated" by the call to
> > lazy_vacuum_heap(). So there is absolutely minimal special case code
> > for the table-without-indexes case now.
>
> Attached is v10, which simplifies the one-pass/table-without-indexes
> VACUUM as described.

Great!

> Other changes (some of which are directly related to the
> one-pass/table-without-indexes refactoring):
>
> * The second patch no longer breaks up lazy_scan_heap() into multiple
> functions -- we only retain the lazy_scan_prune() function, which is
> the one that I find very compelling.
>
> This addresses Robert's concern about the functions -- I think that
> it's much better this way, now that I see it.
>
> * No more diff churn in the first patch. This was another concern held
> by Robert, as well as by Masahiko.
>
> In general both the first and second patches are much easier to follow now.
>
> * The emergency mechanism is now able to kick in when we happen to be
> doing a one-pass/table-without-indexes VACUUM -- no special
> cases/"weasel words" are needed.
>
> * Renamed "onerel" to "rel" in the first patch, per Robert's suggestion.
>
> * Fixed various specific issues raised by Masahiko's review,
> particularly in the first patch and last patch in the series.
>
> Finally, there is a new patch added to the series in v10:
>
> * I now include a modified version of Matthias van de Meent's line
> pointer truncation patch [1].

Thanks for notifying. I've noticed that you've based this on v3 of
that patch, and consequently has at least one significant bug that I
fixed in v5 of that patchset:

0004:
> @@ -962,6 +962,7 @@ heap_get_root_tuples(Page page, OffsetNumber *root_offsets)
> */
> for (;;)
> {
> + Assert(OffsetNumberIsValid(nextoffnum) && nextoffnum <= maxoff);
> lp = PageGetItemId(page, nextoffnum);
>
> /* Check for broken chains */

This assertion is false, and should be a guarding if-statement. HOT
redirect pointers are not updated if the tuple they're pointing to is
vacuumed (i.e. when it was never committed) so this nextoffnum might
in a correctly working system point past maxoff.

> Line pointer truncation doesn't happen during pruning, as it did in
> Matthias' original patch. In this revised version, line pointer
> truncation occurs during the second phase of VACUUM. There are several
> reasons to prefer this approach. It seems both safer and more useful
> that way (compared to the approach of doing line pointer truncation
> during pruning). It also makes intuitive sense to do it this way, at
> least to me -- the second pass over the heap is supposed to be for
> "freeing" LP_DEAD line pointers.

Good catch for running a line pointer truncating pass at the second
pass over the heap in VACUUM, but I believe that it is also very
useful for pruning. Line pointer bloat due to excessive HOT chains
cannot be undone until the 2nd run of VACUUM happens with this patch,
which is a missed chance for all non-vacuum pruning.

> Many workloads rely heavily on opportunistic pruning. With a workload
> that benefits a lot from HOT (e.g. pgbench with heap fillfactor
> reduced to 90), there are many LP_UNUSED line pointers, even though we
> may never have a VACUUM that actually performs a second heap pass
> (because LP_DEAD items cannot accumulate in heap pages). Prior to the
> HOT commit in 2007, LP_UNUSED line pointers were strictly something
> that VACUUM created from dead tuples. It seems to me that we should
> only target the latter "category" of LP_UNUSED line pointers when
> considering truncating the array -- we ought to leave pruning
> (especially opportunistic pruning that takes place outside of VACUUM)
> alone.

What difference is there between opportunistically pruned HOT line
pointers, and VACUUMed line pointers? Truncating during pruning has
the benefit of keeping the LP array short where possible, and seeing
that truncating the LP array allows for more applied
PD_HAS_FREE_LINES-optimization, I fail to see why you wouldn't want to
truncate the LP array whenever clearing up space.

Other than those questions, some comments on the other patches:

0002:
> + appendStringInfo(&buf, _("There were %lld dead item identifiers.\n"),
> + (long long) vacrel->lpdead_item_pages);

I presume this should use vacrel->lpdead_items?.

0003:
> + * ... Aborted transactions
> + * have tuples that we can treat as DEAD without caring about where there
> + * tuple header XIDs ...

This should be '... where their tuple header XIDs...'

> +retry:
> +
> ...
> + res = HeapTupleSatisfiesVacuum(&tuple, vacrel->OldestXmin, buf);
> +
> + if (unlikely(res == HEAPTUPLE_DEAD))
> + goto retry;

In this unlikely case, you reset the tuples_deleted value that was
received earlier from heap_page_prune. This results in inaccurate
statistics, as repeated calls to heap_page_prune on the same page will
not count tuples that were deleted in a previous call.

0004:
> + * truncate to. Note that we avoid truncating the line pointer to 0 items
> + * in all cases.
> + */

Is there a specific reason that I'm not getting as to why this is necessary?

0005:
> + The default is 1.8 billion transactions. Although users can set this value
> + anywhere from zero to 2.1 billion, <command>VACUUM</command> will silently
> + adjust the effective value more than 105% of
> + <xref linkend="guc-autovacuum-freeze-max-age"/>, so that only
> + anti-wraparound autovacuums and aggressive scans have a chance to skip
> + index cleanup.

This documentation doesn't quite make it clear what its relationship
is with autovacuum_freeze_max_age. How about the following: "...
>VACUUM< will use the higher of this value and 105% of
>guc-autovacuum-freeze-max-age<, so that only ...". It's only slightly
easier to read, but at least it conveys that values lower than 105% of
autovacuum_freeze_max_age are not considered. The same can be said for
the multixact guc documentation.

With regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mohamed Mansour 2021-04-05 21:46:36 GSoc Applicant
Previous Message Bryn Llewellyn 2021-04-05 20:58:23 Re: Have I found an interval arithmetic bug?