Re: tsearch_core for inclusion

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: tsearch_core for inclusion
Date: 2007-03-16 16:18:55
Message-ID: 45FAC36F.7020109@kaltenbrunner.cc
Lists: pgsql-hackers

Florian G. Pflug wrote:
> Teodor Sigaev wrote:
>> CREATE INDEX idxname ON tblname USING gin (textcolumn fulltext_ops);
>>
>> The fulltext_ops opclass parses the document the same way to_tsvector does and
>> stores the lexemes in the gin index. It's a full equivalent of
>> CREATE INDEX ... ( to_tsvector( textcolumn ) )
>>
>> And let us define the operation text @ text, which is equivalent to text
>> @@ plainto_tsquery(text), so queries will look like
>> SELECT * FROM tblname WHERE textcolumn @ textquery;
>>
>> Fulltext_ops can speed up both operations, text @@ tsquery and text @
>> text, because the gin API has an extractQuery method which is called once per
>> index scan and can parse the query into lexemes.
>>
>> One disadvantage: this way it isn't possible to do fast ranking, because
>> there is no stored parsed text. Also, fulltext_ops could be done for a GiST
>> index too, but that fulltext opclass would be lossy, which means slow search
>> due to reparsing the text for each index match.
> Just a thought:
>
> If the patch that implements the "GENERATED ALWAYS" syntax is accepted,
> then creating a separate field that holds the parsed text and an index
> on that column becomes as easy as:
> alter table t1 add column text_parsed generated always as
> to_tsvector(text);
> create index idx on t1 using gin (text_parsed fulltext_ops);

Or, to take Tom's idea into consideration:

ALTER TABLE t1 ADD COLUMN text_parsed GENERATED ALWAYS AS to_tsvector(text);
CREATE INDEX idxname ON t1 USING gin (text_parsed);

which looks pretty nice and simple to me
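
For illustration, here is a minimal sketch of how those two statements might
look when spelled out in full. It is not what the GENERATED ALWAYS patch under
discussion proposed. It assumes the syntax that PostgreSQL 12 and later
accept for stored generated columns (an explicit column type, a parenthesized
expression, and STORED). The table t1, its column body, and the 'english'
configuration are hypothetical names chosen for the example.

```sql
-- Hypothetical table; "body" stands in for the text column discussed above.
CREATE TABLE t1 (
    id   serial PRIMARY KEY,
    body text
);

-- Assumption: stored generated column syntax from PostgreSQL 12 onward.
-- The two-argument to_tsvector is used because a generation expression
-- must be immutable, and the one-argument form depends on a session setting.
ALTER TABLE t1
    ADD COLUMN text_parsed tsvector
    GENERATED ALWAYS AS (to_tsvector('english', body)) STORED;

-- No opclass is named, so the default gin opclass for tsvector is used.
CREATE INDEX idxname ON t1 USING gin (text_parsed);

-- A search against the stored, parsed column can then use the index.
SELECT id, body
FROM t1
WHERE text_parsed @@ plainto_tsquery('english', 'full text search');
```

Because the parsed text is stored in its own column, it is also available for
ranking, which the opclass-only approach quoted above does not allow.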

>
> I know that there is a trigger function in tsearch that supports something
> similar, but I really like the simplicity of the statements above.
>
> On a related note - will to_tsvector and to_tsquery be renamed to
> something like ft_parse_text() and ft_parse_query() if tsearch2 goes
> into core? It seems like the "ts" part of those names would be the only
> reference left to the name "tsearch" if they are not, which could be
> somewhat confusing for users.

Well, either rename those functions (and completely destroy the upgrade
path for any current users) or just refer to it as "text search" in the
docs (so that the prefix makes sense).

Stefan
