Another tsearch bug...

From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Teodor Sigaev" <teodor(at)stack(dot)net>
Cc: "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Another tsearch bug...
Date: 2002-08-23 04:25:19
Message-ID: GNELIHDDFBOCMGBFGEFOIENKCDAA.chriskl@familyhealth.com.au
Lists: pgsql-hackers

Hi guys,

Hate to keep coming up with these bugs without patches - but I really don't
have time to look into the source code atm :(

OK, attached is an example of the problem. Notice how trademark and
copyright symbols are being indexed along with the word. This means that if
someone searches for 'balance' in the attached data set, they won't find
anything.

I'm not sure how this should be handled. For English text, it'd probably
be safe to strip high-ASCII characters from the index? But you'd want to
leave accented characters in, I guess. Tricky.
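
To make the distinction concrete, here is a rough sketch of one possible
rule: drop characters that Unicode classifies as symbols, and keep letters,
accented or not. It assumes the tokens are available as Unicode strings,
which is an assumption and not how tsearch itself (which is written in C)
handles text. The helper name strip_symbols is hypothetical.

```python
import unicodedata


def strip_symbols(token: str) -> str:
    """Remove symbol characters from a token, keeping letters and digits.

    Hypothetical helper for illustration only; not part of tsearch.
    """
    # Unicode general categories beginning with "S" are symbols:
    # So (other, e.g. trademark/copyright), Sm (math), Sc (currency),
    # Sk (modifier). Accented letters fall under "L" categories and
    # combining accents under "M", so both are kept.
    return "".join(
        ch for ch in token if not unicodedata.category(ch).startswith("S")
    )


if __name__ == "__main__":
    for word in ["Balance\u2122", "Balance\u00ae", "caf\u00e9"]:
        print(repr(word), "->", repr(strip_symbols(word)))
```

With a rule like this, a word followed by a trademark sign would be indexed
as the bare word, while a word such as 'café' would keep its accent. Whether
that is the right rule for every language is a separate question.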

Anyway, just bringing it to your attention...

Chris

Attachment: example.txt (text/plain, 2.7 KB)
