|From:||Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl>|
|To:||Alvaro Herrera <alvherre(at)commandprompt(dot)com>|
|Cc:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: gsoc, text search selectivity and dllist enhancments|
Alvaro Herrera wrote:
> Jan Urbański wrote:
>> Oh, one important thing. You need to choose a bucket width for the LC
>> algorithm, that is, decide after how many elements you will prune your
>> data structure. I chose to prune after every twenty tsvectors.
> Do you prune after X tsvectors regardless of the number of lexemes in
> them? I don't think that preserves the algorithm properties; if there's
> a bunch of very short tsvectors and then long tsvectors, the pruning
> would take place too early for the initial lexemes. I think you should
> count lexemes, not tsvectors.
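
To make the concern concrete, here is a small sketch comparing where pruning would fire under the two counting schemes. The function names, the tsvector lengths, and the lexeme-based width of 100 are all made up for illustration; only the "every twenty tsvectors" figure comes from the thread.

```python
def prune_offsets_by_tsvector(lengths, every):
    """Lexeme offsets at which pruning fires when counting whole tsvectors."""
    offsets, seen = [], 0
    for i, n in enumerate(lengths, start=1):
        seen += n  # lexemes consumed so far
        if i % every == 0:
            offsets.append(seen)
    return offsets


def prune_offsets_by_lexeme(lengths, width):
    """Lexeme offsets at which pruning fires when counting individual lexemes."""
    total = sum(lengths)
    return list(range(width, total + 1, width))


# Hypothetical input: 40 one-lexeme tsvectors followed by 20 long ones.
lengths = [1] * 40 + [1000] * 20

by_tsvector = prune_offsets_by_tsvector(lengths, every=20)
by_lexeme = prune_offsets_by_lexeme(lengths, width=100)
```

With this input, counting tsvectors prunes after lexemes 20, 40 and 20040, so the effective bucket is 20 lexemes wide at first and 20000 wide later. Counting lexemes keeps every bucket the same width.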
Yes, that's what I was afraid of. I'm not sure why I was reluctant to
prune in the middle of a tsvector; maybe it's just in my head.
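
For reference, a minimal sketch of Lossy Counting with the bucket boundary counted in lexemes, so that pruning may happen in the middle of a tsvector. This is plain Python over lists of strings, not the actual patch; `lossy_count` and `bucket_width` are names chosen for this example.

```python
def lossy_count(tsvectors, bucket_width):
    """Approximate lexeme frequencies using Lossy Counting.

    tsvectors: iterable of iterables of lexemes (strings).
    bucket_width: number of lexemes (not tsvectors) between prunes.
    """
    counts = {}  # lexeme -> [f, delta]
    n = 0        # lexemes seen so far
    for tsvector in tsvectors:
        for lexeme in tsvector:
            n += 1
            bucket = -(-n // bucket_width)  # ceil(n / bucket_width)
            entry = counts.get(lexeme)
            if entry is not None:
                entry[0] += 1
            else:
                # delta is the maximum count the lexeme could have had
                # before it was (re)inserted
                counts[lexeme] = [1, bucket - 1]
            if n % bucket_width == 0:
                # bucket boundary: drop entries with f + delta <= bucket
                stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
                for k in stale:
                    del counts[k]
    return {k: f for k, (f, d) in counts.items()}


docs = [["cat", "dog"], ["cat", "emu"], ["cat", "fox"], ["cat", "gnu"]]
result = lossy_count(docs, bucket_width=3)
```

With a width of 3, the first prune fires after "cat" in the second tsvector, that is, partway through it, and the algorithm's bookkeeping is unaffected.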
Still, there's a decision to be made: after how many lexemes should the
GPG key ID: E583D7D2