Re: gsoc, text search selectivity and dllist enhancments

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc:	Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl>, "Postgres - Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: gsoc, text search selectivity and dllist enhancments
Date:	2008-07-04 15:53:56
Message-ID:	23365.1215186836@sss.pgh.pa.us
Lists:	pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> The data structure I'd suggest is a simple array of pointers
>> to the underlying hash table entries. Since you have a predetermined
>> maximum number of lexemes to track, you can just palloc the array once
>> --- you don't need the expansibility properties of a list.

> The number of lexemes isn't predetermined. It's 2 * (longest tsvector
> seen so far), and we don't know beforehand how long the longest tsvector is.
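
The structure proposed in the quoted text (a fixed-size array of pointers into the hash table, allocated once) can be sketched roughly as follows. This is an illustration in Python, not the patch's C code: the class name `LexemeTracker`, its methods, and the `max_tracked` parameter are all hypothetical, a dict stands in for the hash table, and a preallocated list stands in for the palloc'd array.

```python
class LexemeTracker:
    """Hypothetical sketch: a dict plays the hash table, and a list that is
    allocated once plays the array of pointers to the hash table entries."""

    def __init__(self, max_tracked):
        self.table = {}                     # lexeme -> entry, entry is [lexeme, count]
        self.slots = [None] * max_tracked   # allocated once, never resized
        self.used = 0                       # number of slots currently filled

    def add(self, lexeme):
        """Count one occurrence; return False if a new lexeme does not fit."""
        entry = self.table.get(lexeme)
        if entry is not None:
            entry[1] += 1                   # already tracked: bump the count
            return True
        if self.used == len(self.slots):
            return False                    # array is full; the caller must prune
        entry = [lexeme, 1]
        self.table[lexeme] = entry
        self.slots[self.used] = entry       # the slot refers to the same entry object
        self.used += 1
        return True


tracker = LexemeTracker(max_tracked=2)
for lexeme in ["cat", "dog", "cat", "bird"]:
    if not tracker.add(lexeme):
        print("no room for", lexeme)
```

The sketch only works if the maximum is known before the scan starts, which is exactly the assumption the reply above disputes.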

Hmm, I had just assumed without looking too closely that it was stats
target times a fudge factor. What is the rationale for doing it as
above? I don't think I like the idea of the limit varying over the
course of the scan --- that means that lexemes in different places
in the input will have significantly different probabilities of
surviving to the final result.
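
The concern can be illustrated with a toy model. This is a sketch under stated assumptions, not the algorithm in the patch: the function `surviving_lexemes`, the pruning rule (decrement every count and drop entries that reach zero whenever the table exceeds the limit), and the sample documents are all made up for illustration.

```python
def surviving_lexemes(docs, limit_for):
    """Toy model: count lexemes document by document, pruning whenever the
    table grows past limit_for(longest document seen so far)."""
    counts = {}
    longest = 0
    for doc in docs:
        longest = max(longest, len(doc))
        for lexeme in doc:
            counts[lexeme] = counts.get(lexeme, 0) + 1
        if len(counts) > limit_for(longest):
            # Assumed pruning rule: decrement all counts, drop those at zero.
            counts = {lex: n - 1 for lex, n in counts.items() if n > 1}
    return set(counts)


# "x" and "y" each occur once; "x" comes before the long document, "y" after.
docs = [
    ["x", "a"], ["b", "c"], ["d", "e"],
    ["l%d" % i for i in range(10)],
    ["y", "f"], ["g", "h"], ["i", "j"],
]

# Limit that grows with the longest document seen so far.
varying = surviving_lexemes(docs, lambda longest: 2 * longest)
# Limit fixed before the scan (hypothetical value).
fixed = surviving_lexemes(docs, lambda longest: 30)

print("varying limit keeps x:", "x" in varying, "y:", "y" in varying)
print("fixed limit keeps x:", "x" in fixed, "y:", "y" in fixed)
```

In this model the two lexemes have the same frequency, yet under the growing limit only the later one survives, because the early part of the scan runs with a much smaller table. With a limit fixed up front, both are treated the same way.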

regards, tom lane

In response to

Re: gsoc, text search selectivity and dllist enhancments at 2008-07-04 07:32:32 from Heikki Linnakangas

Responses

Re: gsoc, text search selectivity and dllist enhancments at 2008-07-04 19:20:08 from Heikki Linnakangas
