Re: GUC for cleanup indexes threshold.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: klaussfreire(at)gmail(dot)com
Cc: pg(at)bowt(dot)ie, andres(at)anarazel(dot)de, sawada(dot)mshk(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, david(at)pgmasters(dot)net, amit(dot)kapila16(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org, kuntalghosh(dot)2007(at)gmail(dot)com
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-09-22 07:46:54
Message-ID: 20170922.164654.218262634.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

I apologize in advance for any silliness.

At Thu, 21 Sep 2017 13:54:01 -0300, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote in <CAGTBQpYvgdqxVaiyui=BKrzw7ZZfTQi9KECUL4-Lkc2ThqX8QQ(at)mail(dot)gmail(dot)com>
> On Tue, Sep 19, 2017 at 8:55 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Tue, Sep 19, 2017 at 4:47 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> >> Maybe this is looking at the problem from the wrong direction.
> >>
> >> Why can't the page be added to the FSM immediately and the check be
> >> done at runtime when looking for a reusable page?
> >>
> >> Index FSMs currently store only 0 or 255, couldn't they store 128 for
> >> half-recyclable pages and make the caller re-check reusability before
> >> using it?
> >
> > No, because it's impossible for them to know whether or not the page
> > that their index scan just landed on recycled just a second ago, or
> > was like this since before their xact began/snapshot was acquired.
> >
> > For your reference, this RecentGlobalXmin interlock stuff is what
> > Lanin & Shasha call "The Drain Technique" within "2.5 Freeing Empty
> > Nodes". Seems pretty hard to do it any other way.
>
> I don't see the difference between a vacuum run and distributed
> maintenance at _bt_getbuf time. In fact, the code seems to be in
> place already.

Pages that RecentGlobalXmin prevents from being registered as "free"
cannot be grabbed by _bt_getbuf, since such a page is linked from
nowhere and the FSM does not offer it as "free".

> _bt_page_recyclable seems to prevent old transactions from treating
> those pages as recyclable already, and the description of the
> technique in 2.5 doesn't seem to preclude doing the drain while doing
> other operations. In fact, Lehman even considers the possibility of
> multiple concurrent garbage collectors.

_bt_page_recyclable prevents a vacuum scan from discarding pages
that might still be visited by an active transaction, and the
"drain" itself is a technique to prevent freeing still-active pages,
so a scan using the "drain" technique can run freely and
simultaneously with other transactions. The paper might allow
concurrent GCs (or vacuums), but our nbtree code says that no
concurrent vacuum is assumed. Er... here it is.

nbtpages.c:1589: _bt_unlink_halfdead_page
| * right. This search could fail if either the sibling or the target page
| * was deleted by someone else meanwhile; if so, give up. (Right now,
| * that should never happen, since page deletion is only done in VACUUM
| * and there shouldn't be multiple VACUUMs concurrently on the same
| * table.)

> It's only a matter of making the page visible in the FSM in a way that
> can be efficiently skipped if we want to go directly to a page that
> actually CAN be recycled to avoid looping forever looking for a
> recyclable page in _bt_getbuf. In fact, that's pretty much Lehman's

Mmm. What _bt_getbuf does is recheck each page that the FSM hands
out as a "free page". If the FSM has no more pages to offer, it just
tries to extend the index relation. Or am I misreading you?
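
Roughly, the control flow I have in mind looks like the toy sketch
below. It is not the real _bt_getbuf: ToyIndex, toy_fsm_get_free_block
and toy_get_new_block are made-up names, the FSM is just an array, and
locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_INVALID_BLOCK (-1)

/* Hypothetical, simplified index state; none of this is PostgreSQL API. */
typedef struct ToyIndex
{
    const int  *fsm_blocks;  /* blocks the FSM currently advertises as free */
    size_t      fsm_count;   /* number of entries in fsm_blocks */
    size_t      fsm_next;    /* next FSM entry to hand out */
    const bool *recyclable;  /* per-block result of the recyclability recheck */
    int         nblocks;     /* current length of the relation in blocks */
} ToyIndex;

/* Hand out the next block the FSM claims is free, or TOY_INVALID_BLOCK. */
static int
toy_fsm_get_free_block(ToyIndex *idx)
{
    if (idx->fsm_next >= idx->fsm_count)
        return TOY_INVALID_BLOCK;
    return idx->fsm_blocks[idx->fsm_next++];
}

/* Recheck every candidate from the FSM; extend the relation if none is usable. */
static int
toy_get_new_block(ToyIndex *idx)
{
    int blkno;

    assert(idx != NULL);
    while ((blkno = toy_fsm_get_free_block(idx)) != TOY_INVALID_BLOCK)
    {
        if (idx->recyclable[blkno])
            return blkno;       /* the FSM was right, reuse this page */
        /* otherwise the FSM entry was stale: skip it and ask again */
    }
    return idx->nblocks++;      /* FSM exhausted: append a new block */
}
```

The point of the sketch is that the loop only ever sees blocks the FSM
already offers, so a page kept out of the FSM never reaches the recheck.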

> drain technique right there. FSM entries with 128 would be "the
> queue", and FSM entries with 255 would be "the freelist". _bt_getbuf
> can be the GC getting a buffer to try and recycle, give up after a few
> tries, and get an actual recyclable buffer instead (or extend the
> relation). In essence, microvacuum.
>
> Unless I'm missing something and RecentGlobalXmin somehow doesn't
> exclude all old transactions, I don't see a problem.
>
> Lanin & Shasha use reference counting to do GC wait during the drain,
> and most of the ordering of operations needed there is because of
> that, but using the xmin seems to make all those considerations moot.
> An xact earlier than RecentGlobalXmin cannot have active transactions
> able to follow links to that page AFAIK.
>
> TBH, I didn't read the whole papers, though I probably will.
>
> But, in essence, what's the difference of vacuum doing
>
>     if (_bt_page_recyclable(page))
>     {
>         /* Okay to recycle this page */
>         RecordFreeIndexPage(rel, blkno);
>         vstate->totFreePages++;
>         stats->pages_deleted++;
>     }
>
> VS doing it in _bt_getbuf?
>
> What am I missing?

--
Kyotaro Horiguchi
NTT Open Source Software Center
