Re: Another tsearch bug...

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Teodor Sigaev <teodor(at)stack(dot)net>, "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Another tsearch bug...
Date: 2002-08-23 12:29:25
Message-ID: Pine.GSO.4.44.0208231520120.15230-100000@ra.sai.msu.su
Lists: pgsql-hackers

On Fri, 23 Aug 2002, Christopher Kings-Lynne wrote:

> Hi guys,
>
> Hate to keep coming up with these bugs without patches - but I really don't
> have time to look into the source code atm :(
>
> OK, attached is an example of the problem. Notice how trademarks and
> copyright symbols are being indexed along with the word. This means that if
> someone searches for 'balance' in the above data set, they won't find
> anything.
>
> I'm not sure how this would be handled. In the English language, it'd
> probably be safe to say that high ascii characters would be stripped from
> the index? But you'd want to leave accents and stuff in I guess. Tricky.

Rather tricky. The problem is that we don't know how to get flex to work
with locales. The parser recognizes Latin words ([a-zA-Z]), non-Latin words
([\0200-\0377]) and mixed words ([a-zA-Z\0200-\0377]). Your case (balanceR) is
a mixed word, so the symbol stays attached to the word in the index.
The right way is to have a locale-aware parser that recognizes words properly.
We are inclined to drop flex.
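
To illustrate the three classes, here is a rough sketch in Python. It is not
the tsearch parser, which is generated by flex. The names classify() and
tokenize() are made up for this example, and the input is assumed to be in a
single-byte encoding such as Latin-1, where byte 0xAE is the registered
trademark sign.

```python
import re

# Byte-level character classes mirroring the three described above.
# Octal \0200-\0377 is the same byte range as hex \x80-\xff.
LATIN = re.compile(rb"[a-zA-Z]+")
NONLATIN = re.compile(rb"[\x80-\xff]+")
MIXED = re.compile(rb"[a-zA-Z\x80-\xff]+")


def classify(token: bytes) -> str:
    """Return the class of a token (hypothetical helper, not tsearch API)."""
    if LATIN.fullmatch(token):
        return "latin"
    if NONLATIN.fullmatch(token):
        return "nonlatin"
    if MIXED.fullmatch(token):
        return "mixed"
    return "other"


def tokenize(text: bytes):
    """Split text into maximal runs of word bytes and classify each run."""
    return [(m.group(), classify(m.group())) for m in MIXED.finditer(text)]


if __name__ == "__main__":
    # "balance" followed by byte 0xAE becomes one mixed token, so a search
    # for the plain word "balance" does not match it.
    for token, kind in tokenize(b"balance\xae sheet"):
        print(token, kind)
```

Because every high byte counts as a word character regardless of locale, the
symbol cannot be told apart from an accented letter at this level.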

>
> Anyway, just bringing it to your attention...
>
> Chris
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
