Re: [HACKERS] GUC for cleanup indexes threshold.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: [HACKERS] GUC for cleanup indexes threshold.
Date: 2018-03-05 02:56:00
Message-ID: CAD21AoCEowCyd5KpV5XAj+41gJShE8AXW2woXHweRFRfX577Bw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 4, 2018 at 8:59 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Fri, Mar 2, 2018 at 10:53 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
>>
>> > 2) In the append-only case, index statistics can lag indefinitely.
>>
>> The original proposal added a new GUC that specifies the fraction of
>> modified pages that triggers index cleanup.
>
>
> Regarding the original proposal, I didn't get what exactly it's intended
> to do. You're checking if
> vacuumed_pages >= nblocks * vacuum_cleanup_index_scale.
> But vacuumed_pages is a variable that is incremented only when
> no indexes exist on the table. When indexes are present, this variable is
> always zero. I assume the intent is to compare the number of pages where
> at least one tuple is deleted to nblocks * vacuum_cleanup_index_scale.
> But that is also not an option for us, because we're going to optimize the
> case when exactly zero tuples are deleted by vacuum.

In the latest v4 patch, I compare scanned_pages with the threshold:
if the number of pages modified since the last vacuum exceeds the
threshold, index cleanup is forced.
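
A minimal sketch of the comparison described above, not the code from
the v4 patch. The helper name cleanup_threshold_reached and the
parameter cleanup_scale are hypothetical; the use of >= mirrors the
check quoted earlier and is an assumption about the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's BlockNumber so the sketch compiles alone. */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: report whether the pages scanned by this vacuum
 * reach the given fraction (cleanup_scale) of the table's blocks.
 */
static bool
cleanup_threshold_reached(BlockNumber scanned_pages,
                          BlockNumber nblocks,
                          double cleanup_scale)
{
    /* Assumed comparison: scanned_pages >= nblocks * cleanup_scale. */
    return (double) scanned_pages >= (double) nblocks * cleanup_scale;
}
```

With cleanup_scale = 0.5 on a 100-block table, scanning 50 pages would
force index cleanup under this sketch and scanning 49 would not.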

> The thing I'm going to propose is to add the estimated number of tuples in
> the table to IndexVacuumInfo. Then B-tree can record in the meta-page the
> number of tuples seen when the index was last scanned. If the passed value
> differs too much from the value in the meta-page, then cleanup is forced.
>
> Any better ideas?

I think that would work, but I'm concerned about metapage format
compatibility. Also, since I have not fully investigated index cleanup
for the other index types, I'm not sure that interface makes sense. It
might not be better, but an alternative idea is to add a condition
(Irel[i]->rd_rel->relam == BTREE_AM_OID) in lazy_scan_heap.
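
For reference, the metapage idea quoted above could be sketched roughly
as follows. This is only an illustration: the function name
btree_cleanup_needed, the parameter max_change_fraction, and the
reading of "differs too much" as a relative change are all assumptions,
not part of any posted patch.

```c
#include <stdbool.h>

/*
 * Hypothetical check: heap_tuples_now is the estimate passed in via
 * IndexVacuumInfo, heap_tuples_at_last_scan is the value remembered in
 * the B-tree meta-page. Both names are illustrative only.
 */
static bool
btree_cleanup_needed(double heap_tuples_now,
                     double heap_tuples_at_last_scan,
                     double max_change_fraction)
{
    double diff;

    /* No usable stored value (assumed case): force a cleanup scan. */
    if (heap_tuples_at_last_scan <= 0)
        return true;

    /* Absolute difference, computed without needing libm. */
    diff = heap_tuples_now - heap_tuples_at_last_scan;
    if (diff < 0)
        diff = -diff;

    /* Force cleanup when the relative change exceeds the limit. */
    return diff / heap_tuples_at_last_scan > max_change_fraction;
}
```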

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
