Re: phrase search

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: sushant354(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: phrase search
Date: 2008-06-02 15:39:00
Message-ID: 48441414.6060802@sigaev.ru
Lists: pgsql-hackers

> I have attached a patch for phrase search with respect to the cvs head.
> Basically it takes a phrase (text) and a TSVector. It checks if the
> relative positions of lexeme in the phrase are same as in their
> positions in TSVector.

Ideally, phrase search should be implemented as a new operator in tsquery, say #,
with an optional distance. So the tsquery 'foo #2 bar' means: find all texts
where 'bar' is placed no farther than two words from 'foo'. The complexity lies
in complex boolean expressions ( 'foo #1 ( bar1 & bar2 )' ) and in languages
such as Norwegian or German. German has compound words, like footballbar, which
can be split in several ways, so the result of to_tsquery('foo # footballbar')
will be 'foo # ( ( football & bar ) | ( foot & ball & bar ) )',
where the variants are connected with the OR operation.
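
To make the distance semantics concrete, here is a minimal sketch in C of the
check for the simplest case, two plain lexemes. It is only an illustration: the
name within_distance is hypothetical and is not part of PostgreSQL or of the
patch, and it assumes that order matters ('bar' must come after 'foo'), which
the description above does not settle. It also assumes that each lexeme's
positions are available as a sorted array of integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a PostgreSQL function.
 *
 * Returns true if some position in b[] follows some position in a[] by
 * at least 1 and at most max_dist words. Both arrays must be sorted in
 * ascending order.
 */
static bool
within_distance(const int *a, size_t na, const int *b, size_t nb, int max_dist)
{
    size_t i = 0;
    size_t j = 0;

    assert(max_dist >= 1);

    while (i < na && j < nb)
    {
        if (b[j] <= a[i])
            j++;            /* b[j] is not after a[i]; try the next b */
        else if (b[j] - a[i] <= max_dist)
            return true;    /* b[j] follows a[i] closely enough */
        else
            i++;            /* b[j] is too far after a[i]; try a later a */
    }
    return false;
}
```

With 'foo' at positions {1, 10} and 'bar' at {4, 12}, this check succeeds for
'foo #2 bar' and fails for 'foo #1 bar'. Operands that are themselves boolean
expressions, such as ( bar1 & bar2 ), are exactly the hard part mentioned above
and are not covered by this sketch.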

Of course, phrase search should be able to use indexes.
> 
> If the configuration for text search is "simple", then this will produce
> exact phrase search. Otherwise the stopwords in a phrase will be ignored
> and the words in a phrase will only be matched with the stemmed lexeme.

Your solution can't be used as is, because the user must also use a tsquery in
order to use an index:

column @@ to_tsquery('phrase search') AND  is_phrase_present('phrase search', 
column)

The first clause will be used for the index scan and will quickly find the
candidate rows; the second clause then only has to check those candidates.

> For my application I am using this as a separate shared object. I do not
> know how to expose this function from the core. Can someone explain how
> to do this?

Look at the contrib/ directory in the PostgreSQL source tree and make a contrib
module from your patch. As an example, look at the adminpack module; it's rather
simple.

Comments on your code:
1)
+#ifdef PG_MODULE_MAGIC
+PG_MODULE_MAGIC;
+#endif

That isn't needed for files compiled into core; it's only needed for modules.

2)
  Use only /* */ comments; do not use // (C++ style) comments.
-- 
Teodor Sigaev                                   E-mail: teodor(at)sigaev(dot)ru
                                                    WWW: http://www.sigaev.ru/
