Re: Simplifying Text Search

From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Simplifying Text Search
Date: 2007-11-13 07:58:03
Message-ID: 162867790711122358v729f8208wc07ed31492306799@mail.gmail.com
Lists: pgsql-hackers

On 13/11/2007, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Mon, 2007-11-12 at 23:03 -0500, Bruce Momjian wrote:
> > Simon Riggs wrote:
> > > On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:
> > > > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > > > So we end up with a normal sounding function that is overloaded to
> > > > > provide all of the various goodies.
> > > >
> > > > As best I can tell, @@ does exactly this already. This is just a
> > > > different spelling of the same capability, and I don't actually
> > > > find it better. Why is "text_search(x,y)" better than "x @@ y"?
> > > > We don't recommend that people write "texteq(x,y)" instead of
> > > > "x = y".
> > >
> > > Most people don't understand those differences. x = y means "make sure
> > > they are the same" to most people. They don't see what you (and I) see:
> > > function and operator interchangeability. So text_search() is better
> > > than @@ and = is better than texteq(). Life ain't neat...
> > >
> > > Right now, Full Text Search SQL looks like complete gibberish and it
> > > dissuades many people from using what is an awesome set of features. I
> > > just want to add a little sugar to help people get started.
> >
> > I realized this, though not clearly, when editing the documentation. I
> > noticed that:
> >
> > http://momjian.us/main/writings/pgsql/sgml/textsearch-intro.html#TEXTSEARCH-MATCHING
> >
> > tsvector @@ tsquery
> > tsquery @@ tsvector
> > text @@ tsquery
> > text @@ text
> >
> > The first two of these we saw already. The form text @@ tsquery is
> > equivalent to to_tsvector(x) @@ y. The form text @@ text is equivalent
> > to to_tsvector(x) @@ plainto_tsquery(y).
> >
> > was quite odd, especially the "text @@ text" case, and in fact it makes
> > casting almost required unless you can remember which one is a query and
> > which is a vector (hint, the vector is first). What really adds to the
> > confusion is that the operator is two _identical_ characters, meaning
> > the operator is symmetric, and it behaves symmetrically if you cast one side,
> > but as vector @@ query if you don't.
>
> I'm thinking we can have an inlinable function
>
> contains(text, text) returns int
>
> Return values limited to just 0 or 1 or NULL, as with SQL/MM.
> It's close to SQL/MM, but not exact.
>
> contains(sourceText, searchText) is a macro for
>
> case to_tsvector(default_text_search_config, sourceText) @@
> to_tsquery(default_text_search_config, searchText)
> when true then 1
> when false then 0
> else null
> end
>

It looks good.
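
For reference, the quoted macro could be sketched as an ordinary SQL function along the following lines. This is only an illustration, not a committed design: the parameter roles and the STABLE marking are assumptions, and it relies on the one-argument forms of to_tsvector() and to_tsquery() picking up default_text_search_config rather than passing the configuration explicitly.

```sql
-- Sketch only: contains() as described in the quoted proposal.
-- $1 is the source text, $2 is the search text.
-- The one-argument to_tsvector()/to_tsquery() use the current
-- default_text_search_config setting, hence STABLE rather than IMMUTABLE.
CREATE FUNCTION contains(text, text)
RETURNS integer
LANGUAGE sql
STABLE
AS $$
    SELECT CASE to_tsvector($1) @@ to_tsquery($2)
               WHEN true  THEN 1
               WHEN false THEN 0
               ELSE NULL      -- either argument was NULL
           END
$$;

-- Example call; the search text must be valid to_tsquery() syntax.
SELECT contains('a fat cat sat on a mat', 'cat & mat');
```

Note that to_tsquery() expects operator syntax such as 'cat & mat' and rejects plain phrases like 'cat mat'. Swapping in plainto_tsquery() would accept plain phrases, matching what text @@ text does according to the documentation quoted above.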

Pavel
