Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-26 01:58:30
Message-ID: CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 24, 2021 at 6:05 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> I've attached the updated patch set (nothing changed in 0001 and 0002 patch).

Attached is v7, which takes the last two patches from your v6 and
rebases them on top of my recent work. This includes handling index
cleanup consistently in the event of an emergency. We want to do index
cleanup when the optimization case works out. It would be arbitrary to
not do index cleanup because there was 1 dead tuple instead of 0. Plus
index cleanup is actually useful in some index AMs. At the same time,
it seems like a bad idea to do cleanup in an emergency case. Note that
this includes the case where the new wraparound mechanism kicks in, as
well as the case where INDEX_CLEANUP = off. In general INDEX_CLEANUP =
off should be 100% equivalent to the emergency mechanism, except that
the decision is made dynamically instead of statically.

The two patches that you have been working on are combined into one
patch in v7 -- the last patch in the series. Maintaining two separate
patches there doesn't seem that useful.

The main change I've made in v7 is structural. There is a new patch in
the series, which is now the first. It adds more variables to the
top-level state variable used by VACUUM. We shouldn't have to pass the
same "onerel" variable and other similar variables to so many similar
functions. Plus we shouldn't rely on global state so much. That makes
the code a lot easier to understand. Another change that appears in
the first patch concerns parallel VACUUM, and how it is structured. It
is hard to know which function concerned parallel VACUUM and which is
broader than that right now. It makes it seriously hard to follow at
times. So I have consolidated those functions, and given them less
generic, more descriptive names. (In general it should be possible to
read most of the code in vacuumlazy.c without thinking about parallel
VACUUM in particular.)

I had many problems with existing function arguments that look like this:

IndexBulkDeleteResult **stats // this is a pointer to a pointer to a
IndexBulkDeleteResult.

Sometimes this exact spelling indicates: 1. "This is one particular
index's stats -- this function will have the index AM set the
statistics during ambulkdelete() and/or amvacuumcleanup()".

But at other times/with other function arguments, it indicates: 2.
"Array of stats, once for each of the heap relation's indexes".

I found the fact that both 1 and 2 appear together side by side very
confusing. It is much clearer with 0001-*, though. It establishes a
consistent singular vs plural variable naming convention. It also no
longer uses IndexBulkDeleteResult ** args for case 1 -- even the C
type system ambiguity is avoided. Any thoughts on my approach to this?

Another change in v7: We now stop applying any cost-based delay that
may be in force if and when we abandon index vacuuming to finish off
the VACUUM operation. Robert thought that that was important, and I
agree. I think that it's 100% justified, because this is a true
emergency. When the emergency mechanism (though not INDEX_CLEANUP=off)
actually kicks in, we now also have a scary WARNING. Since this
behavior only occurs when the system is definitely at very real risk
of becoming unavailable, we can justify practically any intervention
that makes it less likely that the system will become 100% unavailable
(except for anything that creates additional risk of data loss).

BTW, spotted this compiler warning in v6:

/code/postgresql/patch/build/../source/src/backend/access/heap/vacuumlazy.c:
In function ‘check_index_vacuum_xid_limit’:
/code/postgresql/patch/build/../source/src/backend/access/heap/vacuumlazy.c:2314:6:
warning: variable ‘effective_multixact_freeze_max_age’ set but not
used [-Wunused-but-set-variable]
2314 | int effective_multixact_freeze_max_age;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I think that it's just a leftover chunk of code. The variable in
question ('effective_multixact_freeze_max_age') does not appear in v7,
in any case. BTW I moved this function into vacuum.c, next to
vacuum_set_xid_limits() -- that seemed like a better place for it. But
please check this yourself.

> Regarding "auto" option, I think it would be a good start to enable
> the index vacuum skipping behavior by default instead of adding “auto”
> mode. That is, we could skip index vacuuming if INDEX_CLEANUP ON. With
> 0003 and 0004 patch, there are two cases where we skip index
> vacuuming: the garbage on heap is very concentrated and the table is
> at risk of XID wraparound. It seems to make sense to have both
> behaviors by default.

I agree. Adding a new "auto" option now seems to me to be unnecessary
complexity. Besides, switching a boolean reloption to a bool-like enum
reloption may have subtle problems.

> If we want to have a way to force doing index
> vacuuming, we can add “force” option instead of adding “auto” option
> and having “on” mode force doing index vacuuming.

It's hard to imagine anybody using the "force option". Let's not have
one. Let's not change the fact that "INDEX_CLEANUP = on" means
"default index vacuuming behavior". Let's just change the index
vacuuming behavior. If we get the details wrong, a simple reloption
will make little difference. We're already being fairly conservative
in terms of the skipping behavior, including with the
SKIP_VACUUM_PAGES_RATIO.

> Also regarding new GUC parameters, vacuum_skip_index_age and
> vacuum_multixact_skip_index_age, those are not autovacuum-dedicated
> parameter. VACUUM command also uses those parameters to skip index
> vacuuming dynamically. In such an emergency case, it seems appropriate
> to me to skip index vacuuming even in VACUUM command. And I don’t add
> any reloption for those two parameters. Since those parameters are
> unlikely to be changed from the default value, I think it don’t
> necessarily need to provide a way for per-table configuration.

+1 for all that. We already have a reloption for this behavior, more
or less -- it's called INDEX_CLEANUP.

The existing autovacuum_freeze_max_age GUC (which is highly related to
your new GUCs) is both an autovacuum GUC, and somehow also not an
autovacuum GUC at the same time. The apparent contradiction only seems
to resolve itself when you consider the perspective of DBAs and the
perspective of Postgres hackers separately.

*Every* VACUUM GUC is an autovacuum GUC when you know for sure that
the relfrozenxid is 1 billion+ XIDs in the past.

> + bool skipping;

> Can we flip the boolean? I mean to use a positive form such as
> "do_vacuum". It seems to be more readable especially for the changes
> made in 0003 and 0004 patches.

I agree that it's clearer that way around.

The code is structured this way in v7. Specifically, there are now
both do_index_vacuuming and do_index_cleanup in the per-VACUUM state
struct in patch 0002-*. These are a direct replacement for useindex.
(Though we only start caring about the do_index_cleanup field in the
final patch, 0004-*)

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v7-0004-Skip-index-vacuuming-in-some-cases.patch application/octet-stream 25.7 KB
v7-0003-Remove-tupgone-special-case-from-vacuumlazy.c.patch application/octet-stream 43.2 KB
v7-0001-Centralize-state-for-each-VACUUM.patch application/octet-stream 111.9 KB
v7-0002-Break-lazy_scan_heap-up-into-functions.patch application/octet-stream 59.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-03-26 02:28:58 Re: Replication slot stats misgivings
Previous Message Amit Langote 2021-03-26 01:51:53 Re: making update/delete of inheritance trees scale better