Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-16 13:39:47
Message-ID: CAD21AoCwpxF_7y=vv1zBcCBiuM2rfRgtZFMDn9JMw-F56Df54Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 15, 2021 at 11:04 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Mar 11, 2021 at 8:31 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > But even if not, I'm not sure this
> > helps much with the situation you're concerned about, which involves
> > non-HOT tuples.
>
> Attached is a POC-quality revision of Masahiko's
> skip_index_vacuum.patch [1]. There is an improved design for skipping
> index vacuuming (improved over the INDEX_CLEANUP stuff from Postgres
> 12). I'm particularly interested in your perspective on this
> refactoring stuff, Robert, because you ran into the same issues after
> initial commit of the INDEX_CLEANUP reloption feature [2] -- you ran
> into issues with the "tupgone = true" special case. This is the case
> where VACUUM considers a tuple dead that was not marked LP_DEAD by
> pruning, and so needs to be killed in the second heap scan in
> lazy_vacuum_heap() instead. You'll recall that these issues were fixed
> by your commit dd695979888 from May 2019. I think that we need to go
> further than you did in dd695979888 for this -- we ought to get rid of
> the special case entirely.

Thank you for the patch!

>
> This patch makes the "VACUUM (INDEX_CLEANUP OFF)" mechanism no longer
> get invoked as if it was like the "no indexes on table so do it all in
> one heap pass" optimization. This seems a lot clearer -- INDEX_CLEANUP
> OFF isn't able to call lazy_vacuum_page() at all (for the obvious
> reason), so any similarity between the two cases was always
> superficial -- skipping index vacuuming should not be confused with
> doing a one-pass VACUUM/having no indexes at all. The original
> INDEX_CLEANUP structure (from commits a96c41fe and dd695979) always
> seemed confusing to me for this reason, FWIW.

Agreed.

>
> Note that I've merged multiple existing functions in vacuumlazy.c into
> one: the patch merges lazy_vacuum_all_indexes() and lazy_vacuum_heap()
> into a single function named vacuum_indexes_mark_unused() (note also
> that lazy_vacuum_page() has been renamed to mark_unused_page() to
> reflect the fact that it is now strictly concerned with making LP_DEAD
> line pointers LP_UNUSED). The big idea is that there is one choke
> point that decides whether index vacuuming is needed at all at one
> point in time, dynamically. vacuum_indexes_mark_unused() decides this
> for us at the last moment. This can only happen during a VACUUM that
> has enough memory to fit all TIDs -- otherwise we won't skip anything
> dynamically.
>
> We may in the future add additional criteria for skipping index
> vacuuming. That can now just be added to the beginning of this new
> vacuum_indexes_mark_unused() function. We may even teach
> vacuum_indexes_mark_unused() to skip some indexes but not others in a
> future release, a possibility that was already discussed at length
> earlier in this thread. This new structure has all the context it
> needs to do all of these things.

I agree to create a function like vacuum_indexes_mark_unused() that
makes a decision and does index and heap vacumming accordingly. But
what is the point of removing both lazy_vacuum_all_indexes() and
lazy_vacuum_heap()? I think we can simply have
vacuum_indexes_mark_unused() call those functions. I'm concerned that
if we add additional criteria to vacuum_indexes_mark_unused() in the
future the function will become very large.

>
> I wonder if we can add some kind of emergency anti-wraparound vacuum
> logic to what I have here, for Postgres 14. Can we come up with logic
> that has us skip index vacuuming because XID wraparound is on the
> verge of causing an outage? That seems like a strategically important
> thing for Postgres, so perhaps we should try to get something like
> that in. Practically every post mortem blog post involving Postgres
> also involves anti-wraparound vacuum.

+1

I think we can set VACOPT_TERNARY_DISABLED to
tab->at_params.index_cleanup in table_recheck_autovac() or increase
the thresholds used to not skipping index vacuuming.

>
> One consequence of my approach is that we now call
> lazy_cleanup_all_indexes(), even when we've skipped index vacuuming
> itself. We should at least "check-in" with the indexes IMV. To an
> index AM, this will be indistinguishable from a VACUUM that never had
> tuples for it to delete, and so never called ambulkdelete() before
> calling amvacuumcleanup(). This seems logical to me: why should there
> be any significant behavioral divergence between the case where there
> are 0 tuples to delete and the case where there is 1 tuple to delete?
> The extra work that we perform in amvacuumcleanup() (if any) should
> almost always be a no-op in nbtree following my recent refactoring
> work. More generally, if an index AM is doing too much during cleanup,
> and this becomes a bottleneck, then IMV that's a problem that needs to
> be fixed in the index AM.

Aside from whether it's good or bad, strictly speaking, it could
change the index AM API contract. The documentation of
amvacuumcleanup() says:

---
stats is whatever the last ambulkdelete call returned, or NULL if
ambulkdelete was not called because no tuples needed to be deleted.
---

With this change, we could pass NULL to amvacuumcleanup even though
the index might have tuples needed to be deleted.

>
> Masahiko: Note that I've also changed the SKIP_VACUUM_PAGES_RATIO
> logic to never reset the count of heap blocks with one or more LP_DEAD
> line pointers, per remarks in a recent email [5] -- that's now a table
> level count of heap blocks. What do you think of that aspect?

Yeah, I agree with that change.

As I mentioned in a recent reply, I'm concerned about a case where we
ran out maintenance_work_mem and decided not to skip index vacuuming
but collected only a few dead tuples in the second index vacuuming
(i.g., the total amount of dead tuples is slightly larger than
maintenance_work_mem). In this case, I think we can skip the second
(i.g., final) index vacuuming at least in terms of
maintenance_work_mem. Maybe the same is true in terms of LP_DEAD
accumulation.

> (BTW, I
> pushed your fix for the "not setting has_dead_tuples/has_dead_items
> variable" issue today, just to get it out of the way.)

Thanks!

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-03-16 13:45:28 Re: pg_subscription - substream column?
Previous Message Amit Langote 2021-03-16 13:37:56 Re: simplifying foreign key/RI checks