Re: GIN - Generalized Inverted iNdex. Try 3.

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN - Generalized Inverted iNdex. Try 3.
Date: 2006-04-28 15:44:36
Message-ID: 44523864.1020604@sigaev.ru
Lists: pgsql-hackers

>> ok. amskipcheck?
>
> Oh, I was thinking of having VACUUM put the heap tuple count into the
> struct and then amvacuumcleanup could copy it over to the index tuple
> count. A "skipcheck" flag only solves the cosmetic problem of not
> getting the warning, it doesn't fix things so that the correct count
> ends up in the index's reltuples entry.

Yes, that's right. I suggested it only as a quick cosmetic hack until we have
a good solution.

Both approaches, counting pointers and taking the number from the heap, have
problems:
* counting pointers - it's not what the planner expects
* taking the number from the heap - problems with null values and partial
  indexes. And it would be done only to pass the check...

As you say, the best way is an exact count during vacuum, but for that
ginbulkdelete would have to collect all pointers in memory, remove duplicates,
and count them. But how much memory would that take for big tables? At least
sizeof(ItemPointerData) * number of tuples.
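
To put a number on that lower bound, here is a rough sketch in Python. It
assumes sizeof(ItemPointerData) is 6 bytes (a 4-byte block id plus a 2-byte
offset), and the function name is made up for illustration. It ignores any
overhead of the structure used to deduplicate, so real usage would be higher.

```python
# Assumption: sizeof(ItemPointerData) == 6 bytes
# (4-byte block id + 2-byte offset number).
ITEM_POINTER_SIZE = 6


def min_pointer_memory_bytes(num_tuples: int) -> int:
    """Lower bound on memory needed to hold one heap pointer per tuple.

    Hypothetical helper: it only multiplies the pointer size by the tuple
    count and ignores the overhead of any deduplication structure.
    """
    if num_tuples < 0:
        raise ValueError("num_tuples must be non-negative")
    return ITEM_POINTER_SIZE * num_tuples


if __name__ == "__main__":
    # Print the lower bound for a few table sizes.
    for n in (1_000_000, 100_000_000, 1_000_000_000):
        mib = min_pointer_memory_bytes(n) / 2**20
        print(f"{n:>13,} tuples -> at least {mib:,.1f} MiB")
```

Under that assumption, a table with a billion tuples needs at least 6 billion
bytes just for the pointers.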

With an inexact calculation we could approximate the number of heap tuples by
finding the greatest heap block number among the pointers stored in the index
and multiplying it by the tuple density (the average number of heap tuples per
heap page)... But in that case we would have to skip the check too.
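
A minimal sketch of that approximation in Python, only to make the arithmetic
concrete. The function name and the (block, offset) tuple representation are
invented for the example, and the tuple density is assumed to be supplied by
the caller. One detail is my own assumption: block numbers start at zero, so
the sketch multiplies by (greatest block number + 1) rather than by the block
number itself.

```python
from typing import Iterable, Tuple

# Hypothetical representation of a heap pointer: (block number, offset).
HeapPointer = Tuple[int, int]


def estimate_heap_tuples(index_pointers: Iterable[HeapPointer],
                         tuples_per_page: float) -> float:
    """Approximate the heap tuple count from pointers stored in an index.

    Finds the greatest heap block number referenced by the index and
    multiplies the implied page count by the average tuple density.
    """
    max_block = -1
    for block, _offset in index_pointers:
        if block > max_block:
            max_block = block
    if max_block < 0:
        # No pointers in the index: nothing to estimate from.
        return 0.0
    # Assumption: blocks are numbered from 0, so block N implies N + 1 pages.
    return (max_block + 1) * tuples_per_page


if __name__ == "__main__":
    pointers = [(0, 1), (0, 2), (3, 1)]
    print(estimate_heap_tuples(pointers, tuples_per_page=50.0))
```

The estimate is only as good as the density figure, and it overcounts when the
tail of the heap is sparsely filled, which is why the check would still have
to be skipped.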

> OK, in that case we'd better add a real amclusterable flag to pg_am,
> rather than assuming amorderstrategy can be used to decide.
>
>> So, two columns about clustering?
>> amclustered
>> amclusterable
>
> Huh? Why two? Either you are allowed to cluster on indexes of this
> type, or you're not. I don't see the point of any other distinction.

amclusterable - as you suggest: does the CLUSTER command do something or not?
amclustered - a table with such an index is always clustered; the CLUSTER
command does nothing, but the optimizer/planner takes the
clustering into consideration for query planning.

We could use only amclustered, but in that case we can't forbid clustering a
table on any index. That is just the current situation.
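
The distinction between the two proposed columns can be sketched as a small
decision table. This is Python purely for illustration, not backend code: the
class, the function names, and the returned strings are all hypothetical, and
the behaviour for an access method with neither flag set (rejecting the
command) is my reading of what "forbid" would mean here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessMethodFlags:
    """Hypothetical stand-in for the two proposed pg_am columns."""
    amclusterable: bool  # CLUSTER actually reorders the table
    amclustered: bool    # table is always clustered on such an index


def cluster_command_action(am: AccessMethodFlags) -> str:
    """What the CLUSTER command would do for an index of this type."""
    if am.amclustered:
        return "no-op"    # already clustered by construction
    if am.amclusterable:
        return "reorder"  # physically rewrite the table in index order
    return "reject"       # clustering on this index type is forbidden


def planner_may_assume_clustering(am: AccessMethodFlags,
                                  clustered_by_command: bool) -> bool:
    """Whether the planner may take clustering into account."""
    return am.amclustered or (am.amclusterable and clustered_by_command)


if __name__ == "__main__":
    for flags in (AccessMethodFlags(True, False),
                  AccessMethodFlags(False, True),
                  AccessMethodFlags(False, False)):
        print(flags, "->", cluster_command_action(flags))
```

With amclustered alone, the "reject" case cannot be expressed, which is the
limitation described above.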

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
