
Re: Full Text Indexing Using Tsearch2-Module

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: "Praveen Kumar (TUV)" <praveen(dot)k(at)renaissance-it(dot)com>
Cc: brad(at)kieser(dot)net, pgsql-admin(at)postgresql(dot)org
Subject: Re: Full Text Indexing Using Tsearch2-Module
Date: 2006-01-23 16:44:21
Message-ID: 20060123164421.GA28344@winnie.fuhr.org
Lists: pgsql-admin

On Mon, Jan 23, 2006 at 03:05:07PM +0530, Praveen Kumar (TUV) wrote:
> I have installed the Tsearch module for full text indexing. But when
> I search text using the gist(idxFTI) index on a table, I also find all
> data which have the same accent. For example:
> 1. If I search for the word MANI, it also finds the word MANY.
> 2. If I search for the word ANDY, it also finds the word ANDI.
> Can you please tell me how to avoid this problem? If I
> search for MANI, it should find only MANI, not MANY.

Your tsearch2 configuration is turning the words "many" and "andy"
into the lexemes "mani" and "andi", like this:

test=> SELECT * FROM ts_debug('many andy');
 ts_name | tok_type | description | token | dict_name | tsvector 
---------+----------+-------------+-------+-----------+----------
 default | lword    | Latin word  | many  | {en_stem} | 'mani'
 default | lword    | Latin word  | andy  | {en_stem} | 'andi'
(2 rows)

To learn how to change that, see "Parsing and Lexing" and "Configurations"
in the tsearch2 documentation:

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-guide.html#parsing_lexing
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-ref.html#configurations
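
As a rough sketch of the configuration route, assuming the stock
tsearch2 contrib tables (pg_ts_cfgmap with columns ts_name, tok_alias
and dict_name) and the foo table used in the examples in this message,
mapping Latin words to the "simple" dictionary instead of en_stem
should keep "many" and "mani" as distinct lexemes. Which token types
to remap, and whether to edit the "default" configuration or create a
separate one, depend on your setup, so treat this as an illustration
rather than a recipe:

```sql
-- Assumption: 'default' is the configuration in use and 'simple' is an
-- installed dictionary (it lowercases words but does not stem them).
UPDATE pg_ts_cfgmap
   SET dict_name = '{simple}'
 WHERE ts_name = 'default'
   AND tok_alias IN ('lword', 'lhword', 'lpart_hword');

-- Existing tsvector values were built with the old mapping, so rebuild
-- them (foo, content and idxcontent are the names used in this message).
UPDATE foo SET idxcontent = to_tsvector('default', content);

-- Check the result: the lexemes should no longer be stemmed.
SELECT to_tsvector('default', 'many andy');
```

tsearch2 may cache configuration data for the life of a session, so
reconnecting after the first UPDATE and before rebuilding the column is
the safer order. Note also that this turns off stemming for every Latin
word, so a search for "cats" would no longer match "cat".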

Another approach is to think of tsearch2 as returning "might match"
rows and add another restriction to find the definite matches:

test=> SELECT * FROM foo
test-> WHERE idxcontent @@ to_tsquery('many');
 id | content | idxcontent 
----+---------+------------
  1 | many    | 'mani':1
  2 | mani    | 'mani':1
(2 rows)

test=> SELECT * FROM foo
test-> WHERE idxcontent @@ to_tsquery('many') AND content ~* 'many';
 id | content | idxcontent 
----+---------+------------
  1 | many    | 'mani':1
(1 row)
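
One caveat with the extra restriction: ~* is an unanchored,
case-insensitive match, so the pattern 'many' also accepts a row such
as "Mani visited Germany", which passes the tsearch2 test through
"Mani" and the regular expression through "Germany". A possible
refinement, sketched here with the same foo table, is to require a
whole-word match using the [[:<:]] and [[:>:]] word-boundary
constraints of PostgreSQL regular expressions:

```sql
-- [[:<:]] and [[:>:]] match the empty string at the start and end of a
-- word, so 'many' must appear in content as a whole word.
SELECT * FROM foo
 WHERE idxcontent @@ to_tsquery('many')
   AND content ~* '[[:<:]]many[[:>:]]';
```

The tsearch2 condition still does the indexed "might match" filtering;
the regular expression only rechecks the rows that survive it.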

-- 
Michael Fuhr
