Re: GUC for cleanup indexes threshold.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-03-04 02:58:17
Message-ID: CAA4eK1+geeoXzGe516S=qaU-2+ZHvnZxVwBizXZp8dSFSqei-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 4, 2017 at 5:59 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Fri, Mar 3, 2017 at 2:41 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> In other words, the number of B-Tree pages that the last VACUUM
>> deleted, and thus made eligible to recycle by the next VACUUM has no
>> relationship with the number of pages the next VACUUM will itself end
>> up deleting, in general, or how long it will be before that next
>> VACUUM comes, if it comes at all, or anything else that seems at all
>> relevant.
>
> This raises another question, though: Why have this GUC at all? Why
> use *any* threshold that is to be compared against the number of heap
> pages that were processed by VACUUM this time?
>
> B-Tree page deletion isn't really part of the ordinary life cycle of a
> B-Tree index. In order for that to be the case, somebody would have to
> delete large ranges of indexed values (typically hundreds of logically
> contiguous values -- no gaps), without anyone else ever inserting new
> tuples that are in the same range before the next VACUUM. It's very
> unlikely that this would happen again and again in the real world. So,
> even if we never freeze, the number of B-Tree pages that we delete
> when we VACUUM today is generally a useless predictor of how many will
> be deleted by a VACUUM that occurs tomorrow. This is true despite the
> fact that the number of dead heap tuples is probably almost identical
> for each VACUUM (or the number of heap pages that end up being
> processed by VACUUM, if you prefer).
>
> Barring any concerns about crash safety, we can be completely certain
> that any "recycling-orientated B-Tree VACUUM" (a btvacuumcleanup()
> call to btvacuumscan(), which happens because there are no tuples in
> the index to kill) will end up recycling however many pages the last
> VACUUM managed to delete, which is a precisely knowable number (or
> could be made knowable if we stashed that number somewhere, like the
> meta-page). It will typically only take seconds or minutes after the
> VACUUM finishes for its RecentGlobalXmin interlock to stop being a
> problem (that is, for _bt_page_recyclable() to return "true" for any
> pages that that VACUUM deleted). From that point on, those deleted
> pages are "money in the bank" for the FSM. The only reason why we'd
> want to tie "the FSM withdrawing that money" to VACUUM is because that
> might be needed to clean up regular bloat anyway.
>
> The test performed by this patch within lazy_scan_heap(), to determine
> whether we should avoid calling lazy_cleanup_index() would therefore
> look like this, ideally: Do I want to go to the trouble of scanning
> this index (a cost that is proportionate to the size of the index) in
> order to recycle this number of known-deleted pages (a benefit that is
> proportionate to that number)? (I still think that the important thing
> is that we don't let the number of unclaimed-by-FSM recyclable pages
> grow forever, though.)
>

You are right that we don't want the number of unclaimed-by-FSM
recyclable pages to grow forever, but I think that won't happen with
this patch. As soon as there are more deletions (in heap), in the
next vacuum cycle, the pages will be reclaimed by lazy_vacuum_index().

> (Thinks about it some more...)
>
> Unfortunately, I just saw a whole new problem with this patch:
> _bt_page_recyclable() is the one place in the B-Tree AM where we stash
> an XID.
>

Can you be more specific to tell which code exactly you are referring here?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-03-04 03:00:11 Re: [pgsql-www] Small issue in online devel documentation build
Previous Message Peter Eisentraut 2017-03-04 02:46:20 Re: [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver