Re: [HACKERS] GUC for cleanup indexes threshold.

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] GUC for cleanup indexes threshold.
Date: 2018-02-27 17:00:23
Message-ID: CAPpHfdsmThnWCz1i=ayzGag0A_QdubtrEZxWRSUM8NMYyEtScQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 12, 2018 at 7:05 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Wed, Jan 10, 2018 at 11:28 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
> > On Sun, Jan 7, 2018 at 8:40 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >> On Sat, Jan 6, 2018 at 2:20 PM, Stephen Frost <sfrost(at)snowman(dot)net>
> >> wrote:
> >>>> > IIRC the patches that make the cleanup scan skip have a problem
> >>>> > pointed out by Peter[1], namely that we stash an XID when a btree
> >>>> > page is deleted, which is used to determine when it's finally safe
> >>>> > to recycle the page. Yura's patch doesn't have that problem?
> >>>> >
> >>>> > [1] https://www.postgresql.org/message-id/CAH2-Wz%3D1%3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com
> >>
> >>> Masahiko Sawada, if this patch isn't viable or requires serious rework
> >>> to be acceptable, then perhaps we should change its status to 'returned
> >>> with feedback' and you can post a new patch for the next commitfest..?
> >>
> >> I believe that the problem that I pointed out with freezing/wraparound
> >> is a solvable problem. If we think about it carefully, we will come up
> >> with a good solution. I have tried to get the ball rolling with my
> >> pd_prune_xid suggestion. I think it's important to not lose sight of
> >> the fact that the page deletion/recycling XID thing is just one detail
> >> that we need to think about some more.
> >>
> >> I cannot fault Sawada-san for waiting to hear other people's views
> >> before proceeding. It really needs to be properly discussed.
> >>
> >
> > Thank you for commenting.
> >
> > IIUC we have two approaches. One idea is based on Peter's suggestion:
> > we can use pd_prune_xid to store the epoch of the XID at which the
> > page was deleted. That way, we can correctly mark deleted pages as
> > recyclable without breaking the on-disk format.
> >
> > The other idea was suggested by Sokolov Yura. His original patch adds
> > a flag to btpo_flags indicating whether the btree has deleted but
> > not-yet-recyclable pages. I'd rather store it as a bool in
> > BTMetaPageData. Currently btm_version is defined as uint32, but I
> > think we won't use all of those bits, so if we store the flag in part
> > of btm_version we don't break the on-disk format. However, we're now
> > assuming that vacuum on a btree index always scans the whole btree
> > rather than a part of it, and this approach would entrench that
> > assumption further. It might become a restriction in the future.
> >
>
> On third thought, similar to what we do for heaps, we can store the
> oldest btpo.xact of a btree index somewhere (the metapage or
> pg_class.relfrozenxid?) after vacuuming. We can then skip cleanup
> vacuuming as long as that XID is not more than a certain threshold old
> (say vacuum_index_cleanup_age). Combined with the new parameter I
> proposed before, the condition for doing cleanup index vacuum becomes
> as follows.
>
> if (there is garbage on heap)
>     Vacuum index, and update oldest btpo.xact
> else /* there is no garbage on heap */
> {
>     if (# of scanned pages > (nblocks * vacuum_cleanup_index_scale_factor)
>         OR oldest btpo.xact is more than vacuum_index_cleanup_age old)
>         Cleanup vacuum index, and update oldest btpo.xact
> }
>
> In current index vacuuming, we scan all index pages if there is even
> one piece of garbage on the heap (I'd like to improve that someday),
> which also updates the oldest btpo.xact at that time. If
> vacuum_cleanup_index_scale_factor is high enough, we can skip scanning
> the whole index as long as either the table isn't modified for 2
> billion transactions or the oldest btpo.xact isn't more than
> vacuum_index_cleanup_age transactions old. But if a transaction stays
> open for a very long time, we might not be able to mark deleted pages
> as recyclable. Some tricks might be required. Feedback is welcome.

Storing the oldest btpo.xact seems like a good idea to me. So, by
comparing the oldest btpo.xact to RecentGlobalXmin, we can determine
whether there are recyclable pages that would really be recycled during
a possible cleanup? Assuming that index page deletions are rare, that
seems like a sufficient criterion to run the cleanup. Also, this
criterion eliminates the risk of wraparound, because we'll recycle
index pages no later than we do now.
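
To illustrate, here is a rough C sketch of that check. Nothing here is
existing code: btm_oldest_btpo_xact is a hypothetical BTMetaPageData
field that would hold the oldest btpo.xact among pages deleted but not
yet recycled.

    #include "postgres.h"
    #include "access/nbtree.h"
    #include "access/transam.h"
    #include "utils/snapmgr.h"

    /* Would a cleanup scan actually recycle at least one deleted page? */
    static bool
    _bt_cleanup_could_recycle(Relation rel)
    {
        Buffer          metabuf;
        BTMetaPageData *metad;
        TransactionId   oldest_btpo_xact;

        metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
        metad = BTPageGetMeta(BufferGetPage(metabuf));
        oldest_btpo_xact = metad->btm_oldest_btpo_xact; /* hypothetical field */
        _bt_relbuf(rel, metabuf);

        /*
         * If the oldest deleted page's XID precedes RecentGlobalXmin, at
         * least one deleted page has become safe to recycle, so a cleanup
         * scan would do useful work.
         */
        return TransactionIdIsValid(oldest_btpo_xact) &&
               TransactionIdPrecedes(oldest_btpo_xact, RecentGlobalXmin);
    }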

Another case when running cleanup is necessary (or at least desirable)
is stale index statistics. Assuming that index statistics can become
stale only when there are no deletions from the index (otherwise the
index would be scanned during vacuum), we can use the number of index
insertions as a measure of statistics staleness.
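
For instance, the staleness test could look like the sketch below. Both
btm_inserts_since_cleanup (a counter that would have to be maintained on
every index insertion) and the vacuum_cleanup_index_scale_factor GUC are
assumptions, not existing code.

    /* Have enough tuples been inserted since the last cleanup scan? */
    static bool
    _bt_stats_are_stale(IndexVacuumInfo *info)
    {
        Relation        rel = info->index;
        Buffer          metabuf;
        BTMetaPageData *metad;
        double          inserts;
        double          ntuples = rel->rd_rel->reltuples;

        metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
        metad = BTPageGetMeta(BufferGetPage(metabuf));
        inserts = metad->btm_inserts_since_cleanup;     /* hypothetical field */
        _bt_relbuf(rel, metabuf);

        /* vacuum_cleanup_index_scale_factor is an assumed GUC here */
        return ntuples > 0 &&
               inserts / ntuples >= vacuum_cleanup_index_scale_factor;
    }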

So, what about the following logic?

if (there is garbage on heap)
    Vacuum index, and update oldest btpo.xact
else /* there is no garbage on heap */
{
    if (number of inserted index tuples since last btvacuumscan() /
        total number of index tuples >= some threshold
        OR oldest btpo.xact is older than RecentGlobalXmin)
        Cleanup vacuum index, and update oldest btpo.xact
}
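
Put together, btvacuumcleanup() might look roughly like this, reusing
the two hypothetical helpers sketched above; only the handling of the
stats == NULL case differs from what the current code does.

    IndexBulkDeleteResult *
    btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
    {
        /* No-op in ANALYZE ONLY mode */
        if (info->analyze_only)
            return stats;

        /*
         * If btbulkdelete was already called (there was garbage on heap),
         * it has scanned the whole index and updated the oldest
         * btpo.xact, so there is nothing more to do here.
         */
        if (stats != NULL)
            return stats;

        /*
         * No garbage on heap: skip the cleanup scan unless statistics are
         * stale or some deleted pages have become recyclable.
         */
        if (!_bt_stats_are_stale(info) &&
            !_bt_cleanup_could_recycle(info->index))
            return NULL;

        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
        btvacuumscan(info, stats, NULL, NULL, 0);   /* updates oldest btpo.xact */

        return stats;
    }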

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
