Re: Proposal: q-gram GIN and GiST indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: q-gram GIN and GiST indexes
Date: 2011-04-05 15:02:51
Message-ID: 3095.1302015771@sss.pgh.pa.us
Lists: pgsql-hackers

Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> On Tue, Apr 5, 2011 at 5:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I am probably being stupid here, but doesn't the number of links to
>> rows grow proportionately to the number of n-grams?

> The number of links to rows grows in proportion to the total number of
> extracted q-grams, but not in proportion to the number of unique q-grams.

Sure. The number of links is exactly proportional to the size of the
text, no? An n-character text contains exactly n-q+1 q-grams, no more,
no less. You might have some rules that cause you to discard some of
them, but basically the TID portion of the index will be proportional
to data volume, with no measurable dependence on q.
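
To make the counting argument concrete, here is a minimal sketch in Python.
It assumes the simplest possible extraction: plain overlapping q-grams with
no padding, no word splitting, and no discard rules. The function name
qgrams and the sample string are illustrative only and not taken from any
existing implementation.

```python
def qgrams(text, q):
    """Return every overlapping q-gram of text, in order.

    Assumption: no padding, no word splitting, no deduplication.
    A text shorter than q yields an empty list.
    """
    return [text[i:i + q] for i in range(len(text) - q + 1)]


text = "postgresql"  # n = 10 characters
for q in (2, 3, 4):
    grams = qgrams(text, q)
    # Total extracted q-grams (one row link each) is n - q + 1.
    assert len(grams) == len(text) - q + 1
    # Distinct q-grams (index keys) can be fewer than the total.
    print(q, len(grams), len(set(grams)))
```

For a text much longer than q, n-q+1 is close to n, which is why the number
of row links tracks the text size rather than the choice of q.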

Or at least that's what it seems like before I've had my morning
caffeine fix...

regards, tom lane
