Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-02-22 06:27:48
Message-ID: CAD21AoAtZb4+HJT_8RoOXvu4HM-Zd4HKS3YSMCH6+-W=bDyh-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 10, 2021 at 4:12 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Tue, Feb 9, 2021 at 6:14 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Thanks. I think that's very good if we resolve this recycling stuff
> > first then try the new approach to skip index vacuum in more cases.
> > That way, even if the vacuum strategy stuff took a very long time to
> > get committed over several major versions, we would not be affected by
> > deleted nbtree page recycling problem (at least for built-in index
> > AMs). Also, the approach of 6655a7299d8 itself is a good improvement
> > and seems straightforward to me.
>
> I'm glad that you emphasized this issue, because I came up with a
> solution that turns out to not be very invasive. At the same time it
> has unexpected advantages, liking improving amcheck coverage for
> deleted pages.

Sorry for the late response.

I've attached the patch that adds a check whether or not to do index
vacuum (and heap vacuum) if 1% of all heap pages have LP_DEAD line
pointer.

While developing this feature, I realized the following two things:

1. Whereas skipping index vacuum and heap vacuum is a very attractive
improvement, if we skip that by default I wonder if we need a way to
disable it. Vacuum plays a role in cleaning and diagnosing tables in
practice. So in a case where the table is bad state and the user wants
to clean all heap pages, it would be good to have a way to disable
this skipping behavior. One solution would be that index_cleanup
option has three different behaviors: on, auto (or smart), and off. We
enable this skipping behavior by default in ‘auto’ mode, but
specifying "INDEX_CLEANUP true” means to enforce index vacuum and
therefore disabling it.

---
2.
@@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams
*params, LVRelStats *vacrelstats,
{
lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
all_visible = false;
+ has_dead_tuples = true;
continue;
}

I added the above change in the patch to count the number of heap
pages having at least one LP_DEAD line pointer. But it's weird to me
that we have never set has_dead_tuples true when we found an LP_DEAD
line pointer. Currently, we set it to false true in 'tupgone' case but
it seems to me that we should do that in this case as well since we
use this flag in the following check:

else if (PageIsAllVisible(page) && has_dead_tuples)
{
elog(WARNING, "page containing dead tuples is marked as
all-visible in relation \"%s\" page %u",
vacrelstats->relname, blkno);
PageClearAllVisible(page);
MarkBufferDirty(buf);
visibilitymap_clear(onerel, blkno, vmbuffer,
VISIBILITYMAP_VALID_BITS);
}

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
skip_index_vacuum.patch application/octet-stream 6.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vik Fearing 2021-02-22 07:00:47 Re: Catalog version access
Previous Message Amit Langote 2021-02-22 06:04:42 Re: a misbehavior of partition row movement (?)