Re: tsearch comments

From: "eric(at)did-it(dot)com" <eric(at)did-it(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Uros Gruber <uros(at)sir-mag(dot)com>, pgsql-general(at)postgresql(dot)org, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tsearch comments
Date: 2003-01-29 14:21:33
Message-ID: 1043850093.14066.32.camel@linuxworks
Lists: pgsql-general pgsql-hackers

Oleg,

We actually have several somewhat similar tables (A, B, C, D, E...) that
have some textual/varchar content. Thus we make a search table Z that
concatenates the textual info from the first tables. Sure, we could
probably use unions and the like, but performance rules out
that scenario :-)

It's much better to search the search table, then show the relevant data
from the source tables based on ranked results.
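
A minimal sketch of that pattern, for illustration only. It builds an
in-memory stand-in for search table Z from two source tables and returns
(tblsource, tblid) pairs for matching rows. The sample rows, the function
names, and the plain word set standing in for the txtidx column are all
assumptions made for the example; a real setup would query tsearch instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchRow:
    tblid: int          # id of the row in the source table
    tblsource: str      # which source table the row came from
    content: frozenset  # stand-in for the txtidx column: lowercased words


def build_search_table(sources):
    """Concatenate title and description of every source row into one search row."""
    rows = []
    for table_name, table_rows in sources.items():
        for row in table_rows:
            text = f"{row['title']} {row['description']}"
            rows.append(
                SearchRow(row["id"], table_name, frozenset(text.lower().split()))
            )
    return rows


def search(search_table, query):
    """Return (tblsource, tblid) pairs whose content contains every query word."""
    words = set(query.lower().split())
    return [(r.tblsource, r.tblid) for r in search_table if words <= r.content]


# Hypothetical sample data standing in for source tables A and B.
sources = {
    "A": [{"id": 1, "title": "Red bicycle", "description": "A fast road bike"}],
    "B": [{"id": 7, "title": "Bike helmet", "description": "Red and light"}],
}
search_table = build_search_table(sources)
hits = search(search_table, "red")
```

Each hit carries enough information to fetch the full row from its source
table, which is the only reason the tblid and tblsource columns exist.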

- Ericson Smith

On Wed, 2003-01-29 at 03:37, Oleg Bartunov wrote:
> On 28 Jan 2003, eric(at)did-it(dot)com wrote:
>
> > Hi,
> >
> > I guess what we're looking for is something on the order (as much as I
> > hate using it as a reference) of MySQL's full text search which does
> > offer some ranking.
> >
> > Just putting ranking alone in tsearch would be a huge benefit. Users can
> > then decide in their own language how to display results, especially
> > since those results may not necessarily require titles or description
> > fragments.
> >
> > For example, we have several huge tables that have the following
> > columns:
> >
> > > id
> > > tbltype
> > > title
> > > description
> >
> > Basically, our customer will look up words that are contained in title
> > and description, so we make an additional table like:
> >
> > > id
> > > tblid (id of the source table)
> > > tblsource (which table)
> > > content (txtidx)
> >
> > Then we can use tsearch to search the second table (we do now), and once
> > we retrieve the ids that we want, we can display results from one or
> > more source tables. Just putting in ranking in tsearch would solve all
> > these problems.
>
> Hmm, people usually use concatenation to get the same result. Do you really
> need that table? Your problem isn't related to the ranking of results.
>
> We could add some ranking support based on local (per-document) statistics.
> Keeping global statistics, for example TFxIDF, would complicate tsearch
> and the maintenance of indices. Proximity ranking as in OpenFTS requires
> more options in the tsearch configuration. Let us think about ranking later,
> after we implement a friendly interface.
>
> >
> > - Ericson Smith
> > http://www.did-it.com
> > http://www.weightlossfriends.com
> >
> >
> > On Tue, 2003-01-28 at 14:00, Oleg Bartunov wrote:
> > > On Tue, 28 Jan 2003, Uros Gruber wrote:
> > >
> > > > Hi!
> > > >
> > > > I think that this would be nice. OpenFTS is great, but it would
> > > > be great if this were implemented in real pg functions.
> > > >
> > > > I think that indexing would be great if pg did it by itself.
> > > >
> > > > Also it could be great if we could define order of weight of
> > > > columns.
> > >
> > > Could you elaborate on this?
> > >
> > > >
> > > > bye Uros
> > > >
> > > > On 28.01.2003 at 11:53:26, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
> > > > wrote:
> > > >
> > > > > On Tue, 28 Jan 2003 sector119(at)mail(dot)ru wrote:
> > > > >
> > > > > > HI
> > > > > >
> > > > > > will we see sort by relevance at tsearch alpha version? :)
> > > > > >
> > > > >
> > > > > not sure. We are concentrating our efforts (well, Teodor is working)
> > > > > on better configurability of tsearch, like OpenFTS has.
> > > > >
> > > > > It's not difficult to add rather naive relevance based on the
> > > > > position of a lexeme in the document, for example. The question is:
> > > > > do you like that kind of relevancy? Real ranking support (as in
> > > > > OpenFTS) requires separate tables to maintain coordinate information.
> > > > > We want to keep tsearch as simple as it is, and for now we are just
> > > > > adding better and friendlier configurability. Do we need to
> > > > > complicate tsearch? We already have OpenFTS, which has most of the
> > > > > features people requested.
> > > > >
> > > >
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
> > > >
> > >
> > > Regards,
> > > Oleg
> > > _____________________________________________________________
> > > Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> > > Sternberg Astronomical Institute, Moscow University (Russia)
> > > Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> > > phone: +007(095)939-16-83, +007(095)939-23-83
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 3: if posting/reading through Usenet, please send an appropriate
> > > subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> > > message can get through to the mailing list cleanly
> >
> >
>
> Regards,
> Oleg
>
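
For reference, the naive position-based relevance mentioned earlier in the
quoted thread could look roughly like the sketch below. It is not tsearch or
OpenFTS code: the function name, the scoring formula (1 / (1 + position of
the first occurrence)), and the sample documents are assumptions chosen for
illustration. It uses only per-document information, so no global statistics
or extra tables are involved.

```python
def position_rank(document, query):
    """Score a document higher when query words appear earlier in it.

    Each distinct query word contributes 1 / (1 + index of its first
    occurrence); words that do not occur contribute nothing.
    """
    tokens = document.lower().split()
    score = 0.0
    for word in set(query.lower().split()):
        if word in tokens:
            score += 1.0 / (1 + tokens.index(word))
    return score


# Hypothetical documents keyed by id.
docs = {
    1: "postgres full text search with tsearch",
    2: "notes on ranking search results in postgres",
}

# Sort document ids by descending score for the query "postgres".
ranked = sorted(
    docs,
    key=lambda doc_id: position_rank(docs[doc_id], "postgres"),
    reverse=True,
)
```

Document 1 ranks first here because the query word is its first token, while
in document 2 it only appears at the end.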
