Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2020-12-28 07:41:44
Message-ID: CAH2-Wz=74Z9NPNvU7Zbq5VLTbepkb3_QpG_PVhgGmjmmnAUyAw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Dec 27, 2020 at 10:55 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > As you said, the next question must be: How do we teach lazy vacuum to
> > not do what gets requested by amvacuumcleanup() when it cannot respect
> > the wishes of an individual index, for example when the
> > accumulation of LP_DEAD items in the heap becomes a big problem in
> > itself? That really could be the thing that forces full heap
> > vacuuming, even with several indexes.
>
> You mean requested by amvacuumstrategy(), not by amvacuumcleanup()? I
> think amvacuumstrategy() affects only ambulkdelete(). But when all
> ambulkdelete() calls are skipped at the request of the index AMs, we
> might want to skip amvacuumcleanup() as well.

No, I was asking about how we should decide to do a real VACUUM (i.e.
a real ambulkdelete() call) even when no index asks for one because
bottom-up deletion works very well in every index. Clearly we will
eventually need to remove the remaining LP_DEAD items from the heap if
nothing else happens -- at some point LP_DEAD items in the heap alone
will force a traditional heap vacuum (which will still have to go
through indexes that have not grown, just to be safe/to avoid
recycling a TID that's still in the index).

Postgres heap fillfactor is 100 by default, though I believe it's 90
in another well-known DB system. If you set Postgres heap fillfactor
to 90, you can fit a little over 200 LP_DEAD items in the "extra
space" left behind in each heap page after the initial bulk
loading/INSERTs take place that respect our lower fillfactor setting
(once pruned, an LP_DEAD item is just a bare 4-byte line pointer, so
the ~10% of an 8KB page that fillfactor 90 reserves holds about 200 of
them). This is about 4x the number of initial heap tuples that fit on
each pgbench_accounts page -- it's quite a lot!
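
To put numbers on it (just a toy calculation, not from any patch --
lpdead_capacity() is a made-up helper):

/*
 * Illustrative only: how many LP_DEAD line pointers fit in the space
 * reserved by a given heap fillfactor?  Ignores page header rounding.
 */
#include "postgres.h"
#include "storage/itemid.h"

static int
lpdead_capacity(int fillfactor)
{
    int         reserved = BLCKSZ * (100 - fillfactor) / 100;

    /* Each LP_DEAD item is a bare line pointer -- 4 bytes */
    return reserved / sizeof(ItemIdData);
}

/* lpdead_capacity(90) == 204 with the default 8KB BLCKSZ */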

If we pessimistically assume that all updates are non-HOT updates,
we'll still usually have enough space for each logical row to get
updated several times (3 or 4 times, going by the 4x figure) before
the heap page "overflows". Even when
there is significant skew in the UPDATEs, the skew is not noticeable
at the level of individual heap pages. We have a surprisingly large
general capacity to temporarily "absorb" extra garbage LP_DEAD items
in heap pages this way. Nobody really cared about this extra capacity
very much before now, because it did not help with the big problem of
index bloat that you naturally see with this workload. But that big
problem may go away soon, and so this extra capacity may become
important at the same time.

I think that it could make sense for lazy_scan_heap() to maintain
statistics about the number of LP_DEAD items remaining in each heap
page (just local stack variables). From there, it can pass the
statistics to the choose_vacuum_strategy() function from your patch.
Perhaps choose_vacuum_strategy() will notice that the heap page with
the most LP_DEAD items encountered within lazy_scan_heap() (among
those encountered so far, in the event of multiple index passes) has
too many LP_DEAD items -- this indicates that there is a danger that
some heap pages will start to "overflow" soon, which is now a problem
that lazy_scan_heap() must think about. Maybe we stop caring about
what amvacuumstrategy()/the indexes say once the heap page with the
most LP_DEAD items (among all heap pages) has already used up perhaps
2/3 of the "extra space" left by applying heap fillfactor (with
settings below 100). That way we do the right thing for the heap
pages, while still mostly avoiding index vacuuming and the final heap
pass.
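
Here is a rough sketch of what I mean -- the struct, the field names,
and the 2/3 cutoff are all placeholders of mine, not anything from
your patch:

#include "postgres.h"
#include "storage/itemid.h"
#include "utils/rel.h"

typedef struct LVScanStats
{
    BlockNumber scanned_pages;          /* pages seen by lazy_scan_heap() */
    int         max_lpdead_per_page;    /* worst heap page seen so far */
} LVScanStats;

static bool
choose_vacuum_strategy(Relation onerel, const LVScanStats *stats)
{
    int         fillfactor = RelationGetFillFactor(onerel,
                                                   HEAP_DEFAULT_FILLFACTOR);
    int         capacity = (BLCKSZ * (100 - fillfactor) / 100) /
        sizeof(ItemIdData);

    /*
     * If the worst heap page has eaten through ~2/3 of the extra space
     * left by fillfactor, override whatever the indexes asked for via
     * amvacuumstrategy() -- index vacuuming and the final heap pass
     * must happen so that LP_DEAD items can actually be removed.
     */
    if (stats->max_lpdead_per_page > capacity * 2 / 3)
        return true;            /* do index vacuuming + heap vacuuming */

    return false;               /* defer to amvacuumstrategy() */
}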

I experimented with this today, and I think that it is a good way to
do it. I like the idea of choose_vacuum_strategy() understanding that
heap pages that are subject to many non-HOT updates have a "natural
extra capacity for LP_DEAD items" that it must care about directly (at
least with non-default heap fillfactor settings). My early testing
shows that it will often take a surprisingly long time for the most
heavily updated heap page to have more than about 100 LP_DEAD items.

> > I will need to experiment in order to improve my understanding of how
> > to make this cooperate with bottom-up index deletion. But that's
> > mostly just a question for my patch (and a relatively easy one).
>
> Yeah, I think we might need something like statistics about garbage
> per index so that each individual index can make a different decision
> based on its status. For example, a btree index might want to skip
> ambulkdelete() if it has only a few dead index tuples in its leaf
> pages. The statistics could live in the stats collector or in btree's
> meta page.

Right. I think that even a very conservative approach could work well.
For example, maybe we teach nbtree's amvacuumstrategy() routine to ask
to do a real ambulkdelete(), except in the extreme case where the
index is *exactly* the same size as it was after the last VACUUM.
This will happen regularly with bottom-up index deletion. Maybe that
approach is a bit too conservative, though.
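
A rough sketch of what that could look like
(btm_last_vacuum_num_pages is an invented metapage field, and I'm
guessing at the amvacuumstrategy() signature/return values from your
patch):

#include "postgres.h"
#include "access/genam.h"
#include "access/nbtree.h"
#include "storage/bufmgr.h"

static IndexVacuumStrategy
btvacuumstrategy(IndexVacuumInfo *info)
{
    Relation    index = info->index;
    BlockNumber num_pages = RelationGetNumberOfBlocks(index);
    Buffer      metabuf;
    BTMetaPageData *metad;
    IndexVacuumStrategy result = INDEX_VACUUM_STRATEGY_BULKDELETE;

    metabuf = _bt_getbuf(index, BTREE_METAPAGE, BT_READ);
    metad = BTPageGetMeta(BufferGetPage(metabuf));

    /*
     * Only ask to skip ambulkdelete() in the extreme case: the index
     * has not grown by even one page since the last VACUUM, which is
     * what we expect when bottom-up deletion keeps up with the churn.
     */
    if (num_pages == metad->btm_last_vacuum_num_pages)
        result = INDEX_VACUUM_STRATEGY_NONE;

    _bt_relbuf(index, metabuf);

    return result;
}

Maintaining btm_last_vacuum_num_pages would just mean one more
metapage field to set at the end of each VACUUM, alongside the
cleanup-related fields that btvacuumcleanup() already maintains there.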

--
Peter Geoghegan
