Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-03 04:49:08
Message-ID: CAD21AoAHoc+9gw3_sP_jxH3DweuFKWh2qi7ebjrmUPFcGcnVWQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 2, 2021 at 2:34 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Mar 1, 2021 at 7:00 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I think that you're right. However, in practice it isn't harmful
> > because has_dead_tuples is only used when "all_visible = true", and
> > only to detect corruption (which should never happen). I think that it
> > should be fixed as part of this work, though.
>
> Currently the first callsite that calls the new
> lazy_vacuum_table_and_indexes() function in the patch
> ("skip_index_vacuum.patch") skips index vacuuming in exactly the same
> way as the second and final lazy_vacuum_table_and_indexes() call site.
> Don't we need to account for maintenance_work_mem in some way?
>
> lazy_vacuum_table_and_indexes() should probably not skip index
> vacuuming when we're close to exceeding the space allocated for the
> LVDeadTuples array. Maybe we should not skip when
> vacrelstats->dead_tuples->num_tuples is greater than 50% of
> dead_tuples->max_tuples? Of course, this would only need to be
> considered when lazy_vacuum_table_and_indexes() is only called once
> for the entire VACUUM operation (otherwise we have far too little
> maintenance_work_mem/dead_tuples->max_tuples anyway).
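[Editor's note: a minimal C sketch of the heuristic proposed above, i.e. only skip index vacuuming while the dead-tuple array is under half full. The names (BYPASS_THRESHOLD, can_skip_index_vacuum, the simplified LVDeadTuples) follow the thread's vocabulary but are hypothetical, not the actual patch.]

```c
#include <stdbool.h>

/* Hypothetical fraction of dead_tuples->max_tuples beyond which index
 * vacuuming should no longer be skipped (per the suggestion above). */
#define BYPASS_THRESHOLD 0.5

/* Simplified stand-in for vacuumlazy.c's dead-tuple accumulator. */
typedef struct LVDeadTuples
{
    int num_tuples;  /* dead TIDs collected so far */
    int max_tuples;  /* capacity, sized from maintenance_work_mem */
} LVDeadTuples;

/* Return true only while the array is comfortably below capacity,
 * so a single lazy_vacuum_table_and_indexes() call could still
 * safely bypass index vacuuming. */
static bool
can_skip_index_vacuum(const LVDeadTuples *dead_tuples)
{
    return dead_tuples->num_tuples <
           (int) (BYPASS_THRESHOLD * dead_tuples->max_tuples);
}
```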

Doesn't that actually mean we consider how many dead *tuples* we
collected during a vacuum? I'm not sure how important it is that we're
close to exceeding the maintenance_work_mem space. Suppose
maintenance_work_mem is 64MB: we will skip neither index vacuum nor
heap vacuum once the number of dead tuples exceeds 5592404 (we can
collect 11184809 tuples in 64MB of memory). But those tuples could be
concentrated in a small number of blocks, for example in a very large
table. That seems to contradict the current strategy of skipping
index vacuuming when relatively few blocks are modified. No?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
