Re: BUG #16499: Escape Characters in FTS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: chirag(dot)gupta1(at)globallogic(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16499: Escape Characters in FTS
Date: 2020-06-18 15:49:40
Message-ID: 1551296.1592495380@sss.pgh.pa.us
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> We are using FTS to implement search and there is a scenario where we have a
> word say "IVR@#1", when I search for "1" there should not be any result as
> per user perspective.
> It seems special characters are replaced by spaces/blanks. Is there any way
> to include Special characters in search? Kindly let us know through this
> channel as earliest possible.

You would need to implement your own text search parser that classifies
"IVR@#1" as a single token. That's certainly do-able, but it's not
exactly trivial. The built-in parser doesn't have any ability to be
reconfigured to apply different tokenization rules: it just does
what's described at

https://www.postgresql.org/docs/current/textsearch-parsers.html

On the bright side, it sounds like you might not care too much
about URLs or hyphenated words, in which case your custom parser
could be far simpler than the built-in one. There is a skeleton
parser in our source tree at src/test/modules/test_parser/ that
might help you get started.
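
As a rough illustration of the kind of tokenization such a parser could
perform, here is a minimal sketch in standalone C, assuming that splitting
on whitespace alone is acceptable for your data. It is not the test_parser
source and it does not use the PostgreSQL parser API: the names ParserState,
parser_init, next_token and the TOK_* constants are hypothetical. A real
parser would wrap similar logic in the start, gettoken, end and lextypes
functions that CREATE TEXT SEARCH PARSER expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes for this sketch only; a real parser reports
 * its own token type numbers through its lextypes function. */
enum { TOK_END = 0, TOK_WORD = 1, TOK_BLANK = 2 };

typedef struct
{
    const char *buf;    /* text being parsed */
    size_t      len;    /* length of buf in bytes */
    size_t      pos;    /* offset of the next unread byte */
} ParserState;

/* Prepare a parser state for a NUL-terminated string. */
static void
parser_init(ParserState *st, const char *text)
{
    assert(st != NULL && text != NULL);
    st->buf = text;
    st->len = strlen(text);
    st->pos = 0;
}

/* Only whitespace separates tokens; '@', '#' and digits are ordinary. */
static int
is_blank(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Return the class of the next token and report where it starts and how
 * long it is. A token is a maximal run of blank bytes or a maximal run of
 * non-blank bytes. Returns TOK_END once the input is exhausted.
 */
static int
next_token(ParserState *st, const char **tok, size_t *toklen)
{
    size_t start;
    int    blank;

    assert(st != NULL && tok != NULL && toklen != NULL);

    if (st->pos >= st->len)
        return TOK_END;

    start = st->pos;
    blank = is_blank(st->buf[st->pos]);
    while (st->pos < st->len && is_blank(st->buf[st->pos]) == blank)
        st->pos++;

    *tok = st->buf + start;
    *toklen = st->pos - start;
    return blank ? TOK_BLANK : TOK_WORD;
}
```

Calling next_token repeatedly on "call IVR@#1 now" returns the word tokens
"call", "IVR@#1" and "now", separated by blank tokens, so "1" never shows
up as a token of its own.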

regards, tom lane
