Re: Comparing tsearch2 vectors.

From: Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
To: Rajesh Kumar Mallah <mallah(at)trade-india(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Comparing tsearch2 vectors.
Date: 2004-07-13 06:36:46
Message-ID: Pine.LNX.4.44.0407130929120.5904-100000@matrix.gatewaynet.com
Lists: pgsql-sql

Mr. Rajesh Kumar Mallah wrote on Jul 13, 2004:

> Achilleus Mantzios wrote:
>
> >Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
> >
> >
> >
> >>Achilleus Mantzios wrote:
> >>
> >>
> >>
> >>>Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>Dear Mantzios,
> >>>>
> >>>>I have to get a set of banners from the database in
> >>>>response to a search term. I want the search term
> >>>>to be compared to the keywords corresponding to the
> >>>>banners stored in the database. Currently I am doing an
> >>>>equality match, but I would like to do it after stemming
> >>>>both sides (search term and keywords).
> >>>>
> >>>>
> >>>>
> >>>>
> >>>You could transform your search terms so that there is the "&"
> >>>separator between them. (& stands for "AND").
> >>>E.g. "handicrafts exporter" becomes "handicrafts&exporter"
> >>>And then
> >>>select * from <your table> where idxfti @@ to_tsquery(<searchterms>);
> >>>
> >>>
> >>>
> >>>
> >>But I do not want 'handicraft exporters of delhi' to show up if I search
> >>for 'handicrafts exporters', whereas
> >>
> >>SELECT to_tsvector('handycrafts exporters of delhi') @@ to_tsquery('handycraft&exporting');
> >>
> >>will be true.
> >>
> >>
> >
> >Define what you want, and then read tsearch2 userguide.
> >I'm sure you'll find your way :)
> >
> >
> The requirement is different from full text search.
> I am not searching for a word in a collection of words (text),
> but rather comparing two strings after all the words in those
> strings are stemmed. I hope my requirement is clear now.

OK, so we fall back to the initial approach.
Tokenize both strings into arrays of strings,
say String[] string1 and String[] string2.
If the arrays are not the same length, the strings are not equal.
Otherwise, for each index i, compare
lexize(<your stem dict>, string1[i]) against
lexize(<your stem dict>, string2[i]).
The strings are equal only if every pair matches.

The tokenization is your job, while the lexize function comes with
tsearch2.

I don't know whether this can be done in plain SQL, since it requires some sort
of iteration.
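
The steps above can be sketched on the client side, outside SQL. This is
only a minimal Python illustration under stated assumptions: toy_stem is a
hypothetical stand-in for lexize(<your stem dict>, word) and strips just a
few English suffixes, the regex tokenizer is likewise an assumption, and
stemmed_equal is a name made up for this example. A real setup would call
tsearch2's lexize() with a proper stemming dictionary for each token.

```python
import re
from typing import Callable, List


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for lexize(<your stem dict>, word).

    Handles only a few English suffixes; it is NOT a real stemmer.
    """
    w = word.lower()
    # Drop a plural "s" (but leave words ending in "ss" alone).
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]
    # Drop one common derivational suffix, keeping at least 3 characters.
    for suffix in ("ing", "er"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            break
    # Drop a trailing "e" so "incense" and "incenses" reduce to the same stem.
    if w.endswith("e") and len(w) > 3:
        w = w[:-1]
    return w


def tokenize(text: str) -> List[str]:
    """Split a string into word tokens (assumed tokenization rule)."""
    return re.findall(r"\w+", text)


def stemmed_equal(a: str, b: str, stem: Callable[[str], str] = toy_stem) -> bool:
    """Compare two strings token by token after stemming each token."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    # Arrays of different length can never be equal.
    if len(tokens_a) != len(tokens_b):
        return False
    # Every pair at the same position must stem to the same value.
    return all(stem(x) == stem(y) for x, y in zip(tokens_a, tokens_b))
```

With this sketch, stemmed_equal('incense exporter', 'incenses exporter') is
true, while 'handicraft exporters of delhi' does not match
'handicrafts exporters' because the token counts differ. Note that the
comparison is positional, so word order matters.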

>
>
> Regds
> mallah.
>
>
>
>
> >
> >
> >>Regds
> >>Mallah.
> >>
> >>
> >>
> >>
> >>
> >>>where idxfti is your tsvector column.
> >>>
> >>>E.g.
> >>># SELECT to_tsvector('handycrafts exporters') @@ to_tsquery('handycraft&exporting');
> >>>?column?
> >>>----------
> >>>t
> >>>(1 row)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>So that the banners for the adword say 'incense exporter' is
> >>>>shown even if 'incenses exporter' or 'incense exporters' is
> >>>>searched.
> >>>>
> >>>>I hope i am able to clarify.
> >>>>
> >>>>Regds
> >>>>Mallah.
> >>>>
> >>>>Achilleus Mantzios wrote:
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>Hi,
> >>>>>>
> >>>>>>We want to compare strings after stemming. Can anyone
> >>>>>>tell me what the best method is? I was thinking of comparing
> >>>>>>the tsvectors, but there is no operator for that.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>I'd tokenize each string and then apply lexize() to get the
> >>>>>equivalent stemmed
> >>>>>word, but what exactly are you trying to accomplish?
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>Regds
> >>>>>>Mallah.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>tradein_clients=# SELECT to_tsvector('handicraft exporters');
> >>>>>>+---------------------------+
> >>>>>>|        to_tsvector        |
> >>>>>>+---------------------------+
> >>>>>>| 'export':2 'handicraft':1 |
> >>>>>>+---------------------------+
> >>>>>>(1 row)
> >>>>>>
> >>>>>>Time: 710.315 ms
> >>>>>>tradein_clients=#
> >>>>>>tradein_clients=# SELECT to_tsvector('handicrafts exporter');
> >>>>>>+---------------------------+
> >>>>>>|        to_tsvector        |
> >>>>>>+---------------------------+
> >>>>>>| 'export':2 'handicraft':1 |
> >>>>>>+---------------------------+
> >>>>>>(1 row)
> >>>>>>
> >>>>>>Time: 400.679 ms
> >>>>>>tradein_clients=# SELECT to_tsvector('Hi there') = to_tsvector('Hi there');
> >>>>>>ERROR: operator does not exist: tsvector = tsvector
> >>>>>>HINT: No operator matches the given name and argument type(s). You may
> >>>>>>need to add explicit type casts.
> >>>>>>tradein_clients=#
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
>
>
>

--
-Achilleus
