Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2020-12-29 06:25:55
Message-ID: CAD21AoC7ceWtCS9WGO=M2qvW4Be0JGr7EPfVR=Ss2b5gR84Aww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 29, 2020 at 7:06 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Dec 27, 2020 at 11:41 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I experimented with this today, and I think that it is a good way to
> > do it. I like the idea of choose_vacuum_strategy() understanding that
> > heap pages that are subject to many non-HOT updates have a "natural
> > extra capacity for LP_DEAD items" that it must care about directly (at
> > least with non-default heap fill factor settings). My early testing
> > shows that it will often take a surprisingly long time for the most
> > heavily updated heap page to have more than about 100 LP_DEAD items.
>
> Attached is a rough patch showing what I did here. It was applied on
> top of my bottom-up index deletion patch series and your
> poc_vacuumstrategy.patch patch. This patch was written as a quick and
> dirty way of simulating what I thought would work best for bottom-up
> index deletion for one specific benchmark/test, which was
> non-hot-update heavy. This consists of a variant pgbench with several
> indexes on pgbench_accounts (almost the same as most other bottom-up
> deletion benchmarks I've been running). Only one index is "logically
> modified" by the updates, but of course we still physically modify all
> indexes on every update. I set fill factor to 90 for this benchmark,
> which is an important factor for how your VACUUM patch works during
> the benchmark.
>
> This rough supplementary patch includes VACUUM logic that assumes (but
> doesn't check) that the table has heap fill factor set to 90 -- see my
> changes to choose_vacuum_strategy(). This benchmark is really about
> stability over time more than performance (though performance is also
> improved significantly). I wanted to keep both the table/heap and the
> logically unmodified indexes (i.e. 3 out of 4 indexes on
> pgbench_accounts) exactly the same size *forever*.
>
> Does this make sense?

Thank you for sharing the patch. That makes sense.

+ if (!vacuum_heap)
+ {
+ if (maxdeadpage > 130 ||
+ /* Also check if maintenance_work_mem space is running out */
+ vacrelstats->dead_tuples->num_tuples >
+ vacrelstats->dead_tuples->max_tuples / 2)
+ vacuum_heap = true;
+ }

The second test checking if maintenane_work_mem space is running out
also makes sense to me. Perhaps another idea would be to compare the
number of collected garbage tuple to the total number of heap tuples
so that we do lazy_vacuum_heap() only when we’re likely to reclaim a
certain amount of garbage in the table.

>
> Anyway, with a 15k TPS limit on a pgbench scale 3000 DB, I see that
> pg_stat_database shows an almost ~28% reduction in blks_read after an
> overnight run for the patch series (it was 508,820,699 for the
> patches, 705,282,975 for the master branch). I think that the VACUUM
> component is responsible for some of that reduction. There were 11
> VACUUMs for the patch, 7 of which did not call lazy_vacuum_heap()
> (these 7 VACUUM operations all only dead a btbulkdelete() call for the
> one problematic index on the table, named "abalance_ruin", which my
> supplementary patch has hard-coded knowledge of).

That's a very good result in terms of skipping lazy_vacuum_heap(). How
much the table and indexes bloated? Also, I'm curious about that which
tests in choose_vacuum_strategy() turned vacuum_heap on: 130 test or
test if maintenance_work_mem space is running out? And what was the
impact on clearing all-visible bits?

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2020-12-29 07:20:44 Re: New IndexAM API controlling index vacuum strategies
Previous Message Masahiko Sawada 2020-12-29 06:18:55 Re: New IndexAM API controlling index vacuum strategies