Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-16 22:20:43
Message-ID: CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 16, 2021 at 6:40 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Note that I've merged multiple existing functions in vacuumlazy.c into
> > one: the patch merges lazy_vacuum_all_indexes() and lazy_vacuum_heap()
> > into a single function named vacuum_indexes_mark_unused()

> I agree to create a function like vacuum_indexes_mark_unused() that
> makes a decision and does index and heap vacumming accordingly. But
> what is the point of removing both lazy_vacuum_all_indexes() and
> lazy_vacuum_heap()? I think we can simply have
> vacuum_indexes_mark_unused() call those functions. I'm concerned that
> if we add additional criteria to vacuum_indexes_mark_unused() in the
> future the function will become very large.

I agree now. I became overly excited about advertising the fact that
these two functions are logically one thing. This is important, but it
isn't necessary to go as far as actually making everything into one
function. Adding some comments would also make that point clear, but
without making vacuumlazy.c even more spaghetti-like. I'll fix it.

> > I wonder if we can add some kind of emergency anti-wraparound vacuum
> > logic to what I have here, for Postgres 14.

> +1
>
> I think we can set VACOPT_TERNARY_DISABLED to
> tab->at_params.index_cleanup in table_recheck_autovac() or increase
> the thresholds used to not skipping index vacuuming.

I was worried about the "tupgone = true" special case causing problems
when we decide to do some index vacuuming and some heap
vacuuming/LP_UNUSED-marking but then later decide to end the VACUUM.
But I now think that getting rid of "tupgone = true" gives us total
freedom to
choose what to do, including the freedom to start with index vacuuming
and then give up on it later -- even after we do some amount of
LP_UNUSED-marking (during a VACUUM with multiple index passes, perhaps
due to a low maintenance_work_mem setting). That isn't actually
special at all, because everything will be 100% decoupled.

Whether or not it's a good idea to either not start index vacuuming or
to end it early (e.g. due to XID wraparound pressure) is a complicated
question, and the right approach will be debatable in each case/when
thinking about each issue. However, deciding on the best behavior to
address these problems should have nothing to do with implementation
details and everything to do with the costs and benefits to users.
Which now seems possible.

A sophisticated version of the "XID wraparound pressure"
implementation could increase reliability without ever being
conservative, just by evaluating the situation regularly and being
prepared to change course when necessary -- until it is truly a matter
of shutting down new XID allocations/the server. It should be possible
to decide to end VACUUM early and advance relfrozenxid (provided we
have reached the point of finishing all required pruning and freezing,
of course). Highly agile behavior seems quite possible, even if it
takes a while to agree on a good design.

> Aside from whether it's good or bad, strictly speaking, it could
> change the index AM API contract. The documentation of
> amvacuumcleanup() says:
>
> ---
> stats is whatever the last ambulkdelete call returned, or NULL if
> ambulkdelete was not called because no tuples needed to be deleted.
> ---
>
> With this change, we could pass NULL to amvacuumcleanup even though
> the index might have tuples needed to be deleted.

I think that we should add a "Note" box to the documentation that
notes the difference here. Though FWIW my interpretation of the words
"no tuples needed to be deleted" was always "no tuples needed to be
deleted because vacuumlazy.c didn't call ambulkdelete()". After all,
VACUUM can add to tups_vacuumed through pruning inside
heap_prune_chain(). It is already possible (though only barely) to not
call ambulkdelete() for indexes (to only call amvacuumcleanup() during
cleanup) despite the fact that heap vacuuming did "delete tuples".

It's not that important, but my point is that the design has always
been top-down -- an index AM "needs to delete" whatever it is told it
needs to delete. It has no direct understanding of any higher-level
issues.

> As I mentioned in a recent reply, I'm concerned about a case where we
> ran out maintenance_work_mem and decided not to skip index vacuuming
> but collected only a few dead tuples in the second index vacuuming
> (i.g., the total amount of dead tuples is slightly larger than
> maintenance_work_mem). In this case, I think we can skip the second
> (i.g., final) index vacuuming at least in terms of
> maintenance_work_mem. Maybe the same is true in terms of LP_DEAD
> accumulation.

I remember that. That now seems very doable, but time grows short...

I have already prototyped Andres' idea, which was to eliminate the
HEAPTUPLE_DEAD case inside lazy_scan_heap() by restarting pruning for
the same page. I've also moved the pruning into its own function
called lazy_scan_heap_page(), because handling the restart requires
that we be careful about not incrementing things until we're sure we
won't need to repeat pruning.

This seems to work well, and the tests all pass. What I have right now
is still too rough to post to the list, though.

Even with a pg_usleep(10000) after the call to heap_page_prune() but
before the second/local HeapTupleSatisfiesVacuum() call, we almost
never actually hit the HEAPTUPLE_DEAD case. So the overhead must be
absolutely negligible. Adding a "goto restart" to the HEAPTUPLE_DEAD
case is ugly, but the "tupgone = true" thing is an abomination, so
that seems okay. This approach definitely seems like the way forward,
because it's obviously safe -- it may even be safer, because
heap_prepare_freeze_tuple() kind of expects this behavior today.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro Ikeda 2021-03-16 23:09:27 RE: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Previous Message Hannu Krosing 2021-03-16 21:59:42 Re: Boundary value check in lazy_tid_reaped()