Re: Tsearch limitations

From: Mike Benoit <mikeb(at)netnation(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql general list <pgsql-general(at)postgresql(dot)org>
Subject: Re: Tsearch limitations
Date: 2003-08-11 17:28:19
Message-ID: 1060622898.25396.33.camel@mikeb.staff.netnation.com
Lists: pgsql-general

Oleg,

Is it possible to have Tsearch support Soundex or Levenshtein distance
(http://ca3.php.net/manual/en/function.levenshtein.php) when searching?

I've never used Tsearch before, but I assume this might just be a matter
of writing a different parser that adds Soundex-encoded versions of words
to the index, and then modifying the query functions to search on both
versions of each word?
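
To make the idea concrete, here is a minimal sketch in Python. It is not
Tsearch code and does not use any Tsearch API: soundex(), build_index() and
search() are hypothetical names made up for illustration, and the "index" is
just a dictionary. It stores each word under both its literal form and its
Soundex code, and a query looks up both.

```python
import re

# Classic American Soundex digit groups (letters not listed have no code).
_CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of an ASCII word ('' if no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    code = first.upper()
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # vowels reset prev, so a repeated code is kept after them
    return (code + "000")[:4]


def build_index(docs):
    """Map ('word', w) and ('sdx', soundex(w)) keys to sets of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index.setdefault(("word", word), set()).add(doc_id)
            index.setdefault(("sdx", soundex(word)), set()).add(doc_id)
    return index


def search(index, term):
    """Return ids of documents matching the term literally or by Soundex code."""
    term = term.lower()
    exact = index.get(("word", term), set())
    fuzzy = index.get(("sdx", soundex(term)), set())
    return exact | fuzzy


docs = {1: "the cathedral in Moscow", 2: "my email address"}
index = build_index(docs)
print(search(index, "cathedrel"))  # misspelling, same Soundex code as "cathedral"
print(search(index, "thed"))       # substring only, so nothing matches
```

Note that this only helps with misspellings that sound alike. A substring
such as "thed" still finds nothing, because matching remains word oriented.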

On Mon, 2003-08-11 at 07:30, Oleg Bartunov wrote:
> On Mon, 11 Aug 2003 psql-mail(at)freeuk(dot)com wrote:
>
> > Oleg,
> >
> > I understand (I think) how the parser breaks up the input into words
> > and builds tsvectors.
> >
> > And I understand how to write queries as described in the documentation.
> > (I have read it!)
> >
> > SELECT * FROM vectors WHERE vector @@ to_tsquery('(leads|forks) & !crawl')
> >
> > But I haven't seen any mention of whether, if I add the word:
> >
> > cathedral
> >
> > there is any query which will match if I search for "thed".
>
> No, tsearch2 is a word-oriented search. It doesn't support substring
> search.
>
> >
> > The documentation seems to say that this cannot be done, but I'd just
> > like to check. Tsearch2 does everything I want except this.
> >
> > "remember that the search operator @@ finds only exact matches between
> > query lexemes and vector lexemes - if they are not exactly the same
> > string, they will not be considered a match"
> >
> >
> > > Mat,
> > >
> > > there are several functions you can use to see how the text is
> > > processed (please read the documentation):
> > >
> > > apod=# select to_tsvector('Hi my email addres is psql-mail(at)freeuk(dot)com');
> > > to_tsvector
> > > ----------------------------------------------------
> > > 'hi':1 'addr':4 'email':3 'psql-mail(at)freeuk(dot)com':6
> > > (1 row)
> > >
> > > or, even better
> > >
> > > apod=# select * from ts_debug('Hi my email addres is psql-mail(at)freeuk(dot)com');
> > >      ts_name     | tok_type | description |           token            | dict_name |          tsvector
> > > -----------------+----------+-------------+----------------------------+-----------+----------------------------
> > >  default_russian | lword    | Latin word  | Hi                         | {en_stem} | 'hi'
> > >  default_russian | lword    | Latin word  | my                         | {en_stem} |
> > >  default_russian | lword    | Latin word  | email                      | {en_stem} | 'email'
> > >  default_russian | lword    | Latin word  | addres                     | {en_stem} | 'addr'
> > >  default_russian | lword    | Latin word  | is                         | {en_stem} |
> > >  default_russian | email    | Email       | psql-mail(at)freeuk(dot)com | {simple}  | 'psql-mail(at)freeuk(dot)com'
> > > (6 rows)
> > >
> > > You may write your own parser or preprocess text before tsearch.
> > >
> > > Oleg
> > > On Mon, 11 Aug 2003, Mat wrote:
> > >
> > > > Can Tsearch be used to return substring matches?
> > > >
> > > > i.e
> > > >
> > > > Text to search: Hi my email addres is psql-mail(at)freeuk(dot)com
> > > >
> > > > Query "psql" would match the email address?
> > > >
> > > > Query "mail" would also match?
> > > >
> > > > Query "reeu" would also match?
> > > >
> > > > Or is tsearch not suitable for this type of query? Should I use FTI
> > > > instead?
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 6: Have you searched our list archives?
> > > >
> > > > http://archives.postgresql.org
> > > >
> > >
> > > Regards,
> > > Oleg
> > > _____________________________________________________________
> > > Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> > > Sternberg Astronomical Institute, Moscow University (Russia)
> > > Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> > > phone: +007(095)939-16-83, +007(095)939-23-83
> > >
> >
> >
>
> Regards,
> Oleg
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
--
Best Regards,

Mike Benoit
NetNation Communications Inc.
Systems Engineer
Tel: 604-684-6892 or 888-983-6600
---------------------------------------

Disclaimer: Opinions expressed here are my own and not
necessarily those of my employer
