Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-24 02:43:59
Message-ID: CAH2-Wzk8f+LEX9hrSxdj3Qj6QMNi1hDGLcF1pY2ffDfEn_9x=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 23, 2021 at 4:02 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Here are review comments on 0003 patch:

Attached is a new revision, v5. It fixes bit rot caused by recent
changes (your index autovacuum logging stuff). It has also been
cleaned up in response to your recent review comments -- both from
this email, and the other review email that I responded to separately
today.

> + * If we skip vacuum, we just ignore the collected dead tuples. Note that
> + * vacrelstats->dead_tuples could have tuples which became dead after
> + * HOT-pruning but are not marked dead yet. We do not process them
> + * because it's a very rare condition, and the next vacuum will process
> + * them anyway.
> + */
>
> The second paragraph is no longer true after removing the 'tupegone' case.

Fixed.

> Maybe we can use vacrelstats->num_index_scans instead of
> calledtwopass? When calling to two_pass_strategy() at the end of
> lazy_scan_heap(), if vacrelstats->num_index_scans is 0 it means this
> is the first time call, which is equivalent to calledtwopass = false.

It's true that when "vacrelstats->num_index_scans > 0" it definitely
can't have been the first call. But how can we distinguish between 1.)
the case where we're being called for the first time, and 2.) the case
where it's the second call, but the first call actually skipped index
vacuuming? When we skip index vacuuming we won't increment
num_index_scans (which seems appropriate to me).

For now I have added an assertion that "vacrelstats->num_index_scan ==
0" at the point where we apply skipping indexes as an optimization
(i.e. the point where the patch 0003- mechanism is applied).

> Perhaps we can make INDEX_CLEANUP option a four-value option: on, off,
> auto, and default? A problem with the above change would be that if
> the user wants to do "auto" mode, they might need to reset
> vacuum_index_cleanup reloption before executing VACUUM command. In
> other words, there is no way in VACUUM command to force "auto" mode.
> So I think we can add "auto" value to INDEX_CLEANUP option and ignore
> the vacuum_index_cleanup reloption if that value is specified.

I agree that this aspect definitely needs more work. I'll leave it to you to
do this in a separate revision of this new 0003 patch (so no changes here
from me for v5).

> Are you updating also the 0003 patch? if you're focusing on 0001 and
> 0002 patch, I'll update the 0003 patch along with the fourth patch
> (skipping index vacuum in emergency cases).

I suggest that you start integrating it with the wraparound emergency
mechanism, which can become patch 0004- of the patch series. You can
manage 0003- and 0004- now. You can post revisions of each of those
two independently of my revisions. What do you think? I have included
0003- for now because you had review comments on it that I worked
through, but you should own that, I think.

I suppose that you should include the versions of 0001- and 0002- you
worked off of, just for the convenience of others/to keep the CF
tester happy. I don't think that I'm going to make many changes that
will break your patch, except for obvious bit rot that can be fixed
through fairly mechanical rebasing.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v5-0001-Refactor-vacuumlazy.c.patch application/x-patch 60.2 KB
v5-0002-Remove-tupgone-special-case-from-vacuumlazy.c.patch application/x-patch 43.3 KB
v5-0003-Skip-index-vacuuming-dynamically.patch application/x-patch 16.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-24 03:02:39 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Fujii Masao 2021-03-24 01:46:21 Re: Nicer error when connecting to standby with hot_standby=off