Re: fts, compond words?

From: Mike Rylander <mrylander(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, POSTGRESQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: fts, compond words?
Date: 2005-12-07 18:20:32
Message-ID: b918cf3d0512071020n2877e80kfebdc9f7533ed956@mail.gmail.com
Lists: pgsql-general

On 12/7/05, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> That is a long discussed thing. We can't formulate unconflicting rules... For
> example:
> 1) a &[dist<=2] ( b &[dist<=3] c )
> 2) a &[dist<=2] ( b |[dist<=3] c )
> 3) a &[dist<=2] !c
> 4) a &[dist<=2] ( b |[dist<=3] !c )
> 5) a &[dist<=2] ( b & c )
> What does exact they mean? What is tsvectors which should be matched by those
> queries?

1, 2, 4, and 5 are obviously ambiguous, but 3 seems straightforward to
me, even if it is more difficult to implement. Would it not be acceptable
to say that proximity modifiers are only valid between two simple lexemes
and cannot be placed next to any compound expression?
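
To make that concrete, here is a rough sketch of what I have in mind,
in Python rather than C. Everything in it is an assumption for
illustration: the tsvector is modelled as a plain dict of lexeme ->
positions, distance is taken as absolute position difference, and
near() / near_not() are made-up names, not anything in tsearch2. The
second function is only one possible reading of case 3.

```python
# Hypothetical model: a tsvector is a dict mapping lexeme -> list of positions.

def near(vec, a, b, max_dist):
    """a &[dist<=n] b: some occurrence of a lies within max_dist of some b."""
    return any(abs(pa - pb) <= max_dist
               for pa in vec.get(a, [])
               for pb in vec.get(b, []))

def near_not(vec, a, c, max_dist):
    """One reading of a &[dist<=n] !c: some occurrence of a has no c
    within max_dist positions of it."""
    return any(all(abs(pa - pc) > max_dist for pc in vec.get(c, []))
               for pa in vec.get(a, []))

# Sample document: 'a' at positions 1 and 10, 'c' at 2, 'b' at 3.
vec = {"a": [1, 10], "b": [3], "c": [2]}
```

With simple lexemes on both sides there is only one pair of position
lists to compare, which is why I think the restricted form avoids the
ambiguity.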

>
> The simple solution is : under operation 'phrase search' (ok, it will be '+'
> below) it must be only 'phrase search operations. I.e.:
> a | b ( c + ( d + e ) ) - good
> a | ( c + ( d & g ) ) - bad.
>

Same as above. And, while '+' would be a very good shortcut for
"&[follows;dist=1]" (or some such), I think the user should be able to
specify the proximity more explicitly as well.

> For example, we have word 'foonish' and after lexize we got two lexemes: 'foo1'
> and 'foo2'. So a good query 'a + foonish' becomes 'a + ( foo1 | foo2 )'...
>

Hrm... that is a problem. Though I think that's a matter of how the
compiled expression is built from user input. Unless I'm mistaken,

a + ( foo1 | foo2 )

is exactly equal to

(a + foo1) | (a + foo2)

Ahhh... but then there is the more complex example of

a + foonish + bar

becoming

a + (foo1 | foo2) + bar

... but I guess that could be

(a + foo1 + bar) | (a + foo2 + bar)
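
In other words, distribute '+' over '|' until only pure phrases are
left. A minimal sketch of that rewrite, again in Python and again with
assumptions: '+' is taken to mean "immediately follows" (dist = 1),
each term of the chain is a non-empty list of alternative lexemes, the
tsvector is a dict of lexeme -> positions, and the function names are
invented for the example.

```python
from itertools import product

def expand_phrase(terms):
    """Turn a chain such as a + (foo1 | foo2) + bar, given as a list of
    lists of alternatives, into the list of pure phrases it stands for."""
    return [tuple(p) for p in product(*terms)]

def phrase_matches(vec, phrase):
    """True if the lexemes of phrase occur at consecutive positions."""
    return any(all(start + i in vec.get(lex, ()) for i, lex in enumerate(phrase))
               for start in vec.get(phrase[0], ()))

def matches(vec, terms):
    """The expanded query matches if any one of its pure phrases matches."""
    return any(phrase_matches(vec, p) for p in expand_phrase(terms))

query = [["a"], ["foo1", "foo2"], ["bar"]]
phrases = expand_phrase(query)  # the two phrases (a, foo1, bar) and (a, foo2, bar)
```

The obvious cost is that the number of phrases multiplies with every
word that lexizes to more than one lexeme, so a long phrase of such
words could blow up.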

>
> Mike Rylander wrote:
> > On 12/6/05, Marcus Engene <mengpg(at)engene(dot)se> wrote:
> >
> > [snip]
> >
> >
> >> A & (B | (New OperatorTheNextWordMustFollow York))
> >>
> >
> >
> > Actually, I love that idea. Oleg, would it be possible to create a
> > tsquery operator that understands proximity? Or, how about allowing a
> > predicate on the current '&' op, as in '&[dist<=1]' meaning "next
> > token follows with a max distance of 1"? I imagine that it would
> > only be useful on unstripped tsvectors, but if the lexeme position is
> > already stored ...
> >
> > --
> > Mike Rylander
> > mrylander(at)gmail(dot)com
> > GPLS -- PINES Development
> > Database Developer
> > http://open-ils.org
> >
>
> --
> Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
> WWW: http://www.sigaev.ru/
>

--
Mike Rylander
mrylander(at)gmail(dot)com
GPLS -- PINES Development
Database Developer
http://open-ils.org
