Re: GIN - Generalized Inverted iNdex. Try 3.

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN - Generalized Inverted iNdex. Try 3.
Date: 2006-04-28 13:41:18
Message-ID: 44521B7E.1050705@sigaev.ru
Lists: pgsql-hackers

> There's a definitional issue here, which is what does it mean to be
> counting index tuples. I think GIN could bypass the VACUUM error check
> by always returning the heap tuple count as its index tuple count. This

One problem: ambulkdelete has no access to the heap or to the heap's statistics
(num_tuples in scan_index() and vacuum_index() in vacuum.c). So ambulkdelete
can't set stats->num_index_tuples equal to num_tuples. With a partial index
the problem gets worse...

After looking into vacuum.c, I found the following ways to skip the check:
1) Simplest: just return NULL from ginvacuumcleanup. Disadvantage:
   this drops all statistics.
2) A quick hack in vacuum.c, to be fixed in the future:
       if (indrel->rd_rel->relam == GIN_AM_OID)
           stats->num_index_tuples = num_tuples;
       else if (stats->num_index_tuples != num_tuples)
       {
           /* check as now */
       }
3) Add a column to pg_am that tells scan_index/vacuum_index to behave
   as above. I don't think a case needing that column is frequent - it is
   only for inverted indexes.
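
For illustration only, here is a minimal standalone sketch of what quick hack (2) amounts to. It is not the real vacuum.c code: the Oid typedef, the IndexStats struct, the OID values and the function check_index_tuple_count() are hypothetical stand-ins, and the real check also has to deal with partial indexes, which this model ignores.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL definitions. */
typedef unsigned int Oid;

#define GIN_AM_OID   ((Oid) 9001)   /* placeholder value, assumption */
#define BTREE_AM_OID ((Oid) 9002)   /* placeholder value, assumption */

typedef struct IndexStats
{
    double num_index_tuples;        /* count reported by the index AM */
} IndexStats;

/*
 * Model of the tuple-count check with the GIN bypass.
 *
 * For a GIN index the reported count is overwritten with the heap count,
 * so a mismatch can never be reported. For every other access method the
 * counts are compared as before.
 *
 * Returns true when the counts are considered consistent.
 */
bool
check_index_tuple_count(Oid relam, IndexStats *stats, double num_tuples)
{
    if (relam == GIN_AM_OID)
    {
        stats->num_index_tuples = num_tuples;
        return true;
    }

    return stats->num_index_tuples == num_tuples;
}
```

Under the same assumptions, variant (3) would replace the comparison against GIN_AM_OID with a per-access-method flag read from pg_am; the rest of the logic would stay the same.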

If there are no objections, on Tuesday we will add quick hack (2) and commit GIN.
After that our plan is:
1) Add opclasses for other array types.
2) Set indisclustered=true for all GIN indexes by changing
   UpdateIndexRelation() and mark_index_clustered(). The issue is:
   can a table be clustered on several indexes now? Because GIN is always
   'clustered', a table could be clustered on several GIN indexes and any one
   other index. The CLUSTER command on a GIN index should do nothing. Maybe it
   would be cleaner to add an indclustered column to pg_am.
3) Return to the WAL problem with GiST.
4) Work on gincostestimate and, possibly, tweak cost estimation for GIN's
   opclasses, including the num_tuples issue.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
