Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-23 11:01:28
Message-ID: CAD21AoD89Ta6Ezu=y5x+ZcsW=8ZiaYzCKgDc7Exq-60Hi3UA6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 22, 2021 at 10:39 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Mar 20, 2021 at 11:05 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > On Wed, Mar 17, 2021 at 7:55 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > 2. A patch to remove the tupgone case.
> >
> > Severa new and interesting changes here -- see below.
> >
> > 3. The patch to optimize VACUUM by teaching it to skip index and heap
> > vacuuming in certain cases where we only expect a very small benefit.
>
> I’ll review the other two patches tomorrow.

Here are review comments on 0003 patch:

+ /*
+ * Check whether or not to do index vacuum and heap vacuum.
+ *
+ * We do both index vacuum and heap vacuum if more than
+ * SKIP_VACUUM_PAGES_RATIO of all heap pages have at least one LP_DEAD
+ * line pointer. This is normally a case where dead tuples on the heap
+ * are highly concentrated in relatively few heap blocks, where the
+ * index's enhanced deletion mechanism that is clever about heap block
+ * dead tuple concentrations including btree's bottom-up index deletion
+ * works well. Also, since we can clean only a few heap blocks, it would
+ * be a less negative impact in terms of visibility map update.
+ *
+ * If we skip vacuum, we just ignore the collected dead tuples. Note that
+ * vacrelstats->dead_tuples could have tuples which became dead after
+ * HOT-pruning but are not marked dead yet. We do not process them
+ * because it's a very rare condition, and the next vacuum will process
+ * them anyway.
+ */

The second paragraph is no longer true after removing the 'tupegone' case.

---
if (dead_tuples->num_tuples > 0)
two_pass_strategy(onerel, vacrelstats, Irel, indstats, nindexes,
- lps, params->index_cleanup);
+ lps, params->index_cleanup,
+ has_dead_items_pages, !calledtwopass);

Maybe we can use vacrelstats->num_index_scans instead of
calledtwopass? When calling to two_pass_strategy() at the end of
lazy_scan_heap(), if vacrelstats->num_index_scans is 0 it means this
is the first time call, which is equivalent to calledtwopass = false.

---
- params.index_cleanup = get_vacopt_ternary_value(opt);
+ {
+ if (opt->arg == NULL || strcmp(defGetString(opt), "auto") == 0)
+ params.index_cleanup = VACOPT_CLEANUP_AUTO;
+ else if (defGetBoolean(opt))
+ params.index_cleanup = VACOPT_CLEANUP_ENABLED;
+ else
+ params.index_cleanup = VACOPT_CLEANUP_DISABLED;
+ }

+ /*
+ * Set index cleanup option based on reloptions if not set to either ON or
+ * OFF. Note that an VACUUM(INDEX_CLEANUP=AUTO) command is interpreted as
+ * "prefer reloption, but if it's not set dynamically determine if index
+ * vacuuming and cleanup" takes place in vacuumlazy.c. Note also that the
+ * reloption might be explicitly set to AUTO.
+ *
+ * XXX: Do we really want that?
+ */
+ if (params->index_cleanup == VACOPT_CLEANUP_AUTO &&
+ onerel->rd_options != NULL)
+ params->index_cleanup =
+ ((StdRdOptions *) onerel->rd_options)->vacuum_index_cleanup;

Perhaps we can make INDEX_CLEANUP option a four-value option: on, off,
auto, and default? A problem with the above change would be that if
the user wants to do "auto" mode, they might need to reset
vacuum_index_cleanup reloption before executing VACUUM command. In
other words, there is no way in VACUUM command to force "auto" mode.
So I think we can add "auto" value to INDEX_CLEANUP option and ignore
the vacuum_index_cleanup reloption if that value is specified.

Are you updating also the 0003 patch? if you're focusing on 0001 and
0002 patch, I'll update the 0003 patch along with the fourth patch
(skipping index vacuum in emergency cases).

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-03-23 11:42:18 Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Previous Message Peter Eisentraut 2021-03-23 10:52:22 Re: OpenSSL 3.0.0 compatibility