Re: GUC for cleanup indexes threshold.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: pg(at)bowt(dot)ie, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, david(at)pgmasters(dot)net, amit(dot)kapila16(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org, kuntalghosh(dot)2007(at)gmail(dot)com
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-09-22 08:31:10
Message-ID: 20170922.173110.253964775.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Fri, 22 Sep 2017 17:21:04 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoBN9ucgMDuinx2ptU8upEToHnR-A35aBcQyZnLFvWdVPg(at)mail(dot)gmail(dot)com>
> On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoD6zgb1W6ps1aXj0CcAB_chDYiiTNtEdpMhkefGg13-GQ(at)mail(dot)gmail(dot)com>
> >> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI
> >> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> Could you elaborate on this? For example, in a btree index, the index
> >> cleanup skips the index scan if index_bulk_delete has been called
> >> during vacuuming, because stats != NULL. So I think we don't
> >> need such a flag.
> >
> > The flag works so that two successive full index scans don't
> > happen in one vacuum round. If any rows are fully deleted, the
> > btvacuumcleanup that follows does nothing.
> >
> > I think what you wanted to solve here was the problem that
> > index_vacuum_cleanup runs a full scan even if it ends with no
> > actual work, in manual or anti-wraparound vacuums. (I'm
> > getting a bit confused on this..) It is caused by using the
> > pointer "stats" as the flag that tells it to do so. If the
> > stats-as-a-flag worked as expected, the GUC wouldn't seem to be
> > required.
>
> Hmm, my proposal is that if a table hasn't changed much since the
> previous vacuum, we skip the index cleanup.
>
> If the table has at least one piece of garbage, we do the
> lazy_vacuum_index and then an IndexBulkDeleteResult is stored, which
> causes btvacuumcleanup to skip its scan. On the other hand, if the
> table doesn't have any garbage but some new tuples have been inserted
> since the previous vacuum, we don't do the lazy_vacuum_index but do the
> lazy_cleanup_index. In this case, we always do the lazy_cleanup_index
> (i.e., we do the full scan) even if only one tuple was inserted. That's
> why I proposed a new GUC parameter which allows us to skip the
> lazy_cleanup_index in that case.
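
To make sure we are talking about the same control flow, here is a
minimal sketch in C of the behavior described above. It is only a
model under assumptions, not the vacuumlazy.c or nbtree code: the
Sketch* types, sketch_plan_index_passes(), the "changed_fraction"
measure and the "cleanup_threshold" parameter are all hypothetical
names, and the real GUC may be defined quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IndexBulkDeleteResult; not the real struct. */
typedef struct SketchStats
{
    double num_index_tuples;
} SketchStats;

/* Which index passes one vacuum round performs in this simplified model. */
typedef struct SketchPlan
{
    bool bulk_delete_scan;   /* lazy_vacuum_index scans the index */
    bool cleanup_scan;       /* lazy_cleanup_index scans the index */
} SketchPlan;

/*
 * dead_tuples:       garbage found in the heap during this round
 * changed_fraction:  assumed measure of how much the table changed
 *                    since the previous vacuum
 * cleanup_threshold: hypothetical GUC; a negative value disables the
 *                    skip, so the cleanup scan always runs
 */
SketchPlan
sketch_plan_index_passes(long dead_tuples, double changed_fraction,
                         double cleanup_threshold)
{
    SketchPlan   plan = {false, false};
    SketchStats  result;
    SketchStats *stats = NULL;

    assert(dead_tuples >= 0);
    assert(changed_fraction >= 0.0);

    if (dead_tuples > 0)
    {
        /* Bulk delete scans the index and leaves a stats result behind. */
        plan.bulk_delete_scan = true;
        result.num_index_tuples = 0.0;
        stats = &result;
    }

    /* stats != NULL acts as the flag: cleanup does not scan again. */
    if (stats != NULL)
        return plan;

    /* Proposed behavior: skip the cleanup scan for a nearly unchanged table. */
    if (cleanup_threshold >= 0.0 && changed_fraction < cleanup_threshold)
        return plan;

    /* Otherwise cleanup does a full scan, even for a single inserted tuple. */
    plan.cleanup_scan = true;
    return plan;
}
```

In this model at most one of the two scans happens per round, and the
threshold only matters on the path where no garbage was found.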

I think the problem raised in this thread is that the last index
scan may leave dangling pages.

> > In addition to that, as Simon and Peter pointed out,
> > index_bulk_delete can leave not-fully-removed pages (so-called
> > half-dead pages and pages that are recyclable but not registered
> > in FSM, AFAICS) in some cases, mainly because of the RecentGlobalXmin
> > interlock. In this case, just inhibiting the cleanup scan by a
> > threshold lets such dangling pages persist in the index. (I
> > couldn't produce that many dangling pages, though..)
> >
> > The first patch in the mail (*1) does that. It seems to have some
> > bugs, though..
> >
> >
> > Since the dangling pages persist until autovacuum decides to scan
> > the table they belong to again, we should run a vacuum round (or
> > index_vacuum_cleanup itself) even when there are no dead rows if we
> > want to clean up such pages within a certain period. The second patch
> > does that.
> >
>
> IIUC half-dead pages are not relevant to this proposal. The proposal
> has two problems:
>
> * By skipping index cleanup we could leave recyclable pages that are
> not marked as recyclable.

Yes.

> * we stash an XID when a btree page is deleted, which is used to
> determine when it's finally safe to recycle the page

Is it a "problem" of this proposal?
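
For reference, the recycle check that the stashed XID feeds into can be
sketched roughly as below. This is a simplified model under
assumptions, not the nbtree code: SketchPage, sketch_xid_precedes() and
sketch_page_recyclable() are hypothetical names, the comparison ignores
the special (non-normal) XIDs that the real code treats separately, and
"oldest_visible_xid" stands in for the RecentGlobalXmin horizon.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t SketchXid;

/* Hypothetical view of a btree page: deleted flag plus the stashed XID. */
typedef struct SketchPage
{
    bool      deleted;
    SketchXid deleted_xid;   /* XID stashed when the page was deleted */
} SketchPage;

/* Modulo-2^32 comparison: true if a is logically older than b. */
static bool
sketch_xid_precedes(SketchXid a, SketchXid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * A deleted page is safe to recycle only once its stashed XID is older
 * than the oldest XID any running transaction could still care about.
 */
bool
sketch_page_recyclable(const SketchPage *page, SketchXid oldest_visible_xid)
{
    assert(page != NULL);
    return page->deleted &&
           sketch_xid_precedes(page->deleted_xid, oldest_visible_xid);
}
```

A page deleted by the current vacuum usually fails this test right
away, so it can only be found recyclable by some later scan of the
index.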

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
