| From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
|---|---|
| To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
| Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Use CASEFOLD() internally rather than LOWER() |
| Date: | 2026-03-25 14:40:23 |
| Message-ID: | CAHgHdKtb2jD+DaTJU+3jnQRZ9hEXSDcPCR8DCCzZTTVeo4jQcA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Mar 24, 2026 at 4:07 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Sat, 2026-03-21 at 20:14 -0700, Mark Dilger wrote:
> > After v2-0001, ILIKE uses str_casefold() for matching, but pg_trgm still
> > uses str_tolower() for trigram extraction (trgm_op.c:352 and :948).
> > With builtin collations, these produce different results.
>
> Interesting, thank you. As stated in the original message, I was unsure
> about changing pg_trgm without adjusting the regex logic, also:
>
>
> https://www.postgresql.org/message-id/64d7949bad90545f981ac7513fb0b4954daca2c9.camel@j-davis.com
>
> do you have a suggestion about an easy way to do that, or should we
> revisit in the next cycle?
>
pg_trgm appears to be lossy, with recheck logic. I would think you just
need to make it return a superset of everything a regex would match, and
then let recheck prune that down. My concern is pg_trgm returning fewer
than all the answers, so that after recheck you get fewer results than a
seqscan would have returned. Would switching to casefold be strictly
broader than regex? If so, you would just need to convert pg_trgm to use
casefold and then rely on the recheck machinery.
Sorry if this misses something discussed upthread. I'm clearly assuming
here that you don't mind that such a change necessitates a REINDEX.
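
To make the concern concrete, here is a minimal Python sketch of a lossy trigram filter followed by a recheck. It is an illustration under stated assumptions, not pg_trgm's actual code: `trigrams`, `seqscan`, and `index_scan` are hypothetical helpers, the trigram extraction omits pg_trgm's padding and word splitting, matching is plain substring containment standing in for ILIKE, and Python's `str.lower()` and `str.casefold()` stand in for str_tolower() and str_casefold(), whose results depend on the collation provider. It does not address the regex question.

```python
from typing import Callable, List, Set

# Hypothetical sample data: "Straße" casefolds to "strasse" but lowercases to "straße".
ROWS = ["Straße", "Strasse", "Street"]
PATTERN = "STRASSE"


def trigrams(s: str) -> Set[str]:
    """Simplified trigram extraction: every 3-character substring (no padding)."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def matches(row: str, pattern: str) -> bool:
    """Stand-in for ILIKE '%pattern%' once matching uses case folding."""
    return pattern.casefold() in row.casefold()


def seqscan(rows: List[str], pattern: str) -> List[str]:
    """Reference answer: test every row directly."""
    return [r for r in rows if matches(r, pattern)]


def index_scan(rows: List[str], pattern: str,
               normalize: Callable[[str], str]) -> List[str]:
    """Lossy trigram filter using `normalize`, then recheck each candidate."""
    wanted = trigrams(normalize(pattern))
    candidates = [r for r in rows if wanted <= trigrams(normalize(r))]
    # Recheck can only remove false positives; it cannot restore missed rows.
    return [r for r in candidates if matches(r, pattern)]


if __name__ == "__main__":
    print("seqscan:          ", seqscan(ROWS, PATTERN))
    print("index (lower):    ", index_scan(ROWS, PATTERN, str.lower))
    print("index (casefold): ", index_scan(ROWS, PATTERN, str.casefold))
```

In this sketch the lower-based filter drops "Straße" before recheck ever sees it, so the index scan returns fewer rows than the seqscan, while the casefold-based filter returns the same rows as the seqscan. Recheck is only safe when the filter stage yields a superset of the true matches.
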
--
*Mark Dilger*