Re: FTS parser - missing UUID token type

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: FTS parser - missing UUID token type
Date:	2022-09-14 14:10:39
Message-ID:	2673581.1663164639@sss.pgh.pa.us
Lists:	pgsql-hackers

Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl> writes:
> I miss UUID, which indexes very strangely, is more and more popular and
> people want to search for it.

Really? UUIDs in running text seem like an extremely uncommon
use-case to me. URLs in running text are common nowadays, which is
why the text search parser has special code for that, but UUIDs?

Adding such a thing isn't cost-free either. Aside from the
probably-substantial development effort, we know from experience
with the URL support that it sometimes misfires and identifies
something as a URL or URL fragment when it really isn't one.
That leads to poorer indexing of the affected text. It seems
likely that adding a UUID token type would be a net negative
for most people, since they'd be subject to that hazard even if
their text contains no true UUIDs.

It's a shame that the text search parser isn't more extensible.
If it were, you could imagine having such a feature while making
it optional. I'm not volunteering to fix that though :-(

regards, tom lane
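
To make the "optional feature" idea concrete, here is a minimal sketch in Python. It is a toy illustration only, not the PostgreSQL text search parser and not any PostgreSQL API: `tokenize`, `uuid_recognizer`, the `extra` parameter, and the `word`/`uuid` token type names are all hypothetical. It assumes a base tokenizer that simply splits on non-alphanumeric characters, and shows how an opt-in recognizer could keep a UUID together as one token while leaving the default behavior untouched for callers who do not enable it.

```python
import re
from typing import Callable, List, Optional, Sequence, Tuple

Token = Tuple[str, str]  # (token_type, text)
# A recognizer inspects `text` at `pos` and returns (token_type, end) or None.
Recognizer = Callable[[str, int], Optional[Tuple[str, int]]]

# Canonical 8-4-4-4-12 hex layout; the lookahead rejects a longer alphanumeric run.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?![0-9A-Za-z])"
)
WORD_RE = re.compile(r"[0-9A-Za-z]+")


def uuid_recognizer(text: str, pos: int) -> Optional[Tuple[str, int]]:
    """Hypothetical opt-in recognizer for UUID-shaped tokens."""
    m = UUID_RE.match(text, pos)
    return ("uuid", m.end()) if m else None


def tokenize(text: str, extra: Sequence[Recognizer] = ()) -> List[Token]:
    """Toy tokenizer: optional recognizers run first, then plain words."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        for recognize in extra:
            hit = recognize(text, pos)
            if hit:
                token_type, end = hit
                tokens.append((token_type, text[pos:end]))
                pos = end
                break
        else:
            m = WORD_RE.match(text, pos)
            if m:
                tokens.append(("word", m.group()))
                pos = m.end()
            else:
                pos += 1  # skip separators such as spaces and hyphens
    return tokens
```

Calling `tokenize(text)` without recognizers splits a UUID into five separate hex fragments, whereas `tokenize(text, [uuid_recognizer])` emits it as a single `uuid` token. Because the recognizer is passed in explicitly, text that contains no UUIDs is only exposed to the misidentification hazard described above when the caller chooses to enable it.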

In response to

FTS parser - missing UUID token type at 2022-09-14 09:26:41 from Przemysław Sztoch
