Re: Simplifying Text Search

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Trevor Talbot <quension(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Simplifying Text Search
Date: 2007-11-14 02:58:27
Message-ID: 473A6453.10905@dunslane.net
Lists: pgsql-hackers

Trevor Talbot wrote:
> On 11/13/07, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
>> On Tuesday, 13 November 2007, Gregory Stark wrote:
>>
>>> "Peter Eisentraut" <peter_e(at)gmx(dot)net> writes:
>>>
>>>> What we'd need is a way to convert a LIKE pattern into a tsquery
>>>> ('%foo%bar%' => 'foo & bar'). Then you might even be able to sneak
>>>> index-optimized text search into existing applications. Might be worth a
>>>> try.
>>>>
>>> I don't think that's the right direction to go. Notably, "%foo%bar%" isn't
>>> the same thing as "foo & bar". Also, most tsearch queries can't be expressed
>>> as LIKE patterns anyway.
>>>
>> The requirement is to express LIKE patterns as tsearch queries, not the other
>> way around.
>>
>
> How? LIKE queries are incapable of expressing word boundaries, do not
> support substitution, and are implicitly ordered. tsearch queries
> operate entirely on word boundaries, may substitute words, and are
> unordered.
>
> I don't see the two as even working in the same space, let alone being
> convertible for optimization purposes. If the idea was just to use a
> tsearch index as an initial filter and then run LIKE on the results,
> dictionary-based substitution makes that unreliable.
>

The fact that we are having this discussion at all demonstrates to me
that we should leave well alone - any use of LIKE in this context is
just about guaranteed to cause massive confusion. (Not to mention that
it's far too late in the dev cycle to be making such changes, if we're
thinking of them for 8.3).
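
For illustration, the mismatch described in the quoted text can be sketched
in a few lines of Python. This is a toy model under stated assumptions, not
PostgreSQL's implementation: like_match and words_and_match are hypothetical
helpers, and the word matcher deliberately ignores stemming, stop words, and
dictionary substitution, all of which would widen the gap further.

```python
import re


def like_match(pattern: str, text: str) -> bool:
    """Rough model of SQL LIKE: '%' is any run of characters, '_' is any one.

    Hypothetical helper; escape sequences and collation are not modelled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), text, flags=re.DOTALL) is not None


def words_and_match(terms, text: str) -> bool:
    """Toy model of a 'foo & bar' style query.

    Every term must appear as a whole word, in any order. Hypothetical
    helper; no stemming or dictionary substitution is applied.
    """
    words = set(re.findall(r"\w+", text.lower()))
    return all(term.lower() in words for term in terms)


samples = ["foobar", "bar then foo", "foo and bar"]
for s in samples:
    # Print both verdicts side by side for each sample string.
    print(s, like_match("%foo%bar%", s), words_and_match(["foo", "bar"], s))
```

In this sketch, "foobar" satisfies the LIKE pattern but not the word query
(no word boundaries), "bar then foo" satisfies the word query but not the
LIKE pattern (ordering), and only "foo and bar" satisfies both.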

cheers

andrew
