Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-18 02:55:11
Message-ID: CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 15, 2021 at 4:11 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I kinda wonder whether this case should just be handled by just gotoing
> back to the start of the blkno loop, and redoing the pruning. The only
> thing that makes that a bit more complicatd is that we've already
> incremented vacrelstats->{scanned_pages,vacrelstats->tupcount_pages}.
>
> We really should put the per-page work (i.e. the blkno loop body) of
> lazy_scan_heap() into a separate function, same with the
> too-many-dead-tuples branch.

Attached patch series splits everything up. There is now a large patch
that removes the tupgone special case, and a second patch that
actually adds code that dynamically decides to not do index vacuuming
in cases where (for whatever reason) it doesn't seem useful.

Here are some key points about the first patch:

* Eliminates the "tupgone = true" special case by putting pruning, the
HTSV() call, as well as tuple freezing into a new, dedicated function
-- the function is prepared to restart pruning in those rare cases
where the vacuumlazy.c HTSV() call indicates that a tuple is dead.
Restarting pruning again prunes again, rendering the DEAD tuple with
storage an LP_DEAD line pointer stub.

The restart thing is based on Andres' suggestion.

This patch enables incremental VACUUM (the second patch, and likely
other variations) by allowing us to make a uniform assumption that it
is never strictly necessary to reach lazy_vacuum_all_indexes() or
lazy_vacuum_heap(). It is now possible to "end VACUUM early" while
still advancing relfrozenxid. Provided we've finished the first scan
of the heap, that should be safe.

* In principle we could visit and revisit the question of whether or
not vacuuming should continue or end early many times, as new
information comes to light. For example, maybe Masahiko's patch from
today could be taught to monitor how old relfrozenxid is again and
again, before finally giving up early when the risk of wraparound
becomes very severe -- but not before then.

* I've added three structs that replace a blizzard of local variables
we used lazy_scan_heap() with just three (three variables for each of
the three structs). I've also moved several chunks of logic to other
new functions (in addition to one that does pruning and freezing).

I think that I have the data structures roughly right here -- but I
would appreciate some feedback on that. Does this seem like the right
direction?

--
Peter Geoghegan

Attachment Content-Type Size
v3-0002-Skip-index-vacuuming-dynamically.patch application/octet-stream 16.5 KB
v3-0001-Remove-tupgone-special-case-from-vacuumlazy.c.patch application/octet-stream 72.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message kuroda.hayato@fujitsu.com 2021-03-18 02:59:25 RE: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Previous Message Justin Pryzby 2021-03-18 02:52:28 Re: Parallel INSERT (INTO ... SELECT ...)