Re: FTS uses "tsquery" directly in the query

From: xu fei <autofei(at)yahoo(dot)com>
To: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: FTS uses "tsquery" directly in the query
Date: 2010-01-25 23:30:01
Message-ID: 362774.80233.qm@web45404.mail.sp1.yahoo.com
Lists: pgsql-general

Hi, Ivan:
I agree with you, and I would also like to 'hack' into the code. The current FTS is the best one among database systems and a great building block for supporting more functions. Here are some I can think of:
- choose "|" or "&" as an optional parameter for to_tsquery, to_tsvector
- choose normalization or not for to_tsquery, to_tsvector
- the current two rankings are not enough: for the default ts_rank, I have not figured out the algorithm; for ts_rank_cd, we have the paper, but it is designed for short queries with 2 or 3 tokens

The normalization could be similar to Apache Lucene, where it is really easy to modify and build your own tokenizer. I still feel confused after reading the manual.

I am not sure whether there is currently a team helping Oleg Bartunov. If needed, I can try to do something rather than just hacking it. I am sure Ivan will also join this. :)

Xu
--- On Mon, 1/25/10, Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> wrote:

From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
Subject: Re: [GENERAL] FTS uses "tsquery" directly in the query
To: pgsql-general(at)postgresql(dot)org
Date: Monday, January 25, 2010, 4:33 PM

On Mon, 25 Jan 2010 23:35:12 +0300 (MSK)
Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:

> Do you guys wanted something like:
>
> arxiv=# select and2or(to_tsquery('1 & 2 & 3'));
>         and2or
> ---------------------
>   ( '1' | '2' ) | '3'
> (1 row)

Nearly. I'm starting from a weighted tsvector not from text/tsquery.
I would like to:
- keep the weights in the query
- avoid parsing the text to extract lexemes twice (I already have a
  tsvector)

For me extending pg in C is a new science, but I'm actually trying
to write at least a couple of functions that:
- will return a tsvector as a weight int, pos int[], lexeme text
  record
- will turn a tsvector + operator into a tsquery
  'orange':A1,2,3 'banana':B4,5 'tomato':C6,7 ->
  'orange':A | 'banana':B | 'tomato':C
  or alternatively
  'orange':A & 'banana':B & 'tomato':C
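
The transformation described in the list above can be sketched outside the server, working on the text form of a tsvector. This is only an illustration in Python, not the C functions being planned: the names Entry, parse_tsvector and tsvector_to_tsquery are hypothetical, and the parser assumes entries look like 'lexeme':spec, where spec mixes positions and weight letters (both the shorthand used above, A1,2,3, and PostgreSQL's printed form, 1A,2A,3A, are accepted).

```python
import re
from typing import List, NamedTuple

# One tsvector entry: a quoted lexeme, optionally followed by ":" and a
# comma-separated mix of positions and weight letters (assumed format).
_ENTRY = re.compile(r"'((?:[^']|'')*)'(?::([0-9A-Da-d,]+))?")


class Entry(NamedTuple):
    lexeme: str
    weights: str          # distinct weight letters, e.g. "A"; "" if none given
    positions: List[int]


def parse_tsvector(text: str) -> List[Entry]:
    """Split the text form of a tsvector into (lexeme, weights, positions)."""
    entries = []
    for m in _ENTRY.finditer(text):
        lexeme = m.group(1).replace("''", "'")   # undo doubled quotes
        spec = (m.group(2) or "").upper()
        weights = "".join(sorted(set(re.findall(r"[A-D]", spec))))
        positions = [int(p) for p in re.findall(r"\d+", spec)]
        entries.append(Entry(lexeme, weights, positions))
    return entries


def tsvector_to_tsquery(text: str, op: str = "|") -> str:
    """Join the lexemes with op, keeping weights and dropping positions."""
    if op not in ("|", "&"):
        raise ValueError("op must be '|' or '&'")
    terms = []
    for e in parse_tsvector(text):
        quoted = "'" + e.lexeme.replace("'", "''") + "'"
        terms.append(quoted + (":" + e.weights if e.weights else ""))
    return f" {op} ".join(terms)


if __name__ == "__main__":
    vec = "'orange':A1,2,3 'banana':B4,5 'tomato':C6,7"
    print(tsvector_to_tsquery(vec, "|"))
    print(tsvector_to_tsquery(vec, "&"))
```

The resulting string could then be cast to tsquery on the server. Note that this still goes through the text representation, so it does not avoid the second parse; a C function reading the tsvector directly would.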

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
