From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: FTS parser - missing UUID token type |
Date: | 2022-09-14 14:10:39 |
Message-ID: | 2673581.1663164639@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl> writes:
> I miss UUID, which indexes very strangely, is more and more popular and
> people want to search for it.

Really? UUIDs in running text seem like an extremely uncommon
use-case to me. URLs in running text are common nowadays, which is
why the text search parser has special code for that, but UUIDs?

Adding such a thing isn't cost-free either. Aside from the
probably-substantial development effort, we know from experience
with the URL support that it sometimes misfires and identifies
something as a URL or URL fragment when it really isn't one.
That leads to poorer indexing of the affected text. It seems
likely that adding a UUID token type would be a net negative
for most people, since they'd be subject to that hazard even if
their text contains no true UUIDs.

It's a shame that the text search parser isn't more extensible.
If it were you could imagine having such a feature while making
it optional. I'm not volunteering to fix that though :-(

regards, tom lane
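
To make the misfire hazard and the "optional feature" idea concrete, here is a toy sketch in Python. It is not PostgreSQL's text search parser and is not taken from the message above: the `tokenize` function, the `recognize_uuid` flag, and both regular expressions are hypothetical, chosen only to illustrate how a UUID-shaped pattern can claim text that is not really a UUID, and how an opt-in switch would leave other users unaffected.

```python
import re

# Hypothetical rule: any 8-4-4-4-12 run of hex digits that is not followed
# by another alphanumeric character is treated as a "UUID".
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?![0-9A-Za-z])"
)
# Fallback rule: a plain run of ASCII letters and digits.
WORD_RE = re.compile(r"[0-9A-Za-z]+")


def tokenize(text, recognize_uuid=False):
    """Return (type, token) pairs; the UUID token type is opt-in."""
    tokens = []
    pos = 0
    while pos < len(text):
        if recognize_uuid:
            m = UUID_RE.match(text, pos)
            if m:
                tokens.append(("uuid", m.group()))
                pos = m.end()
                continue
        m = WORD_RE.match(text, pos)
        if m:
            tokens.append(("word", m.group()))
            pos = m.end()
        else:
            pos += 1  # skip separators such as spaces and hyphens
    return tokens


if __name__ == "__main__":
    # A six-group hex key: UUID-shaped at the front, but not a UUID.
    sample = "key 0a1b2c3d-1111-2222-3333-444455556666-7777 here"
    print(tokenize(sample))
    print(tokenize(sample, recognize_uuid=True))
```

With the flag off, the six-group key is split into six plain word tokens. With the flag on, its first five groups are claimed as a single `uuid` token and the trailing `7777` is left as a separate word, so the key is indexed differently even though the text contains no true UUID. In this toy, the cost of the extra token type is paid only by callers who opt in.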