From: | Todd Lang <Todd(dot)Lang(at)D2L(dot)com> |
---|---|
To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Supporting non-deterministic collations with tailoring rules. |
Date: | 2025-09-23 14:51:28 |
Message-ID: | YT2PPF959236618377A072745A280E278F4BE1DA@YT2PPF959236618.CANPRD01.PROD.OUTLOOK.COM |
Lists: | pgsql-hackers |
Reposting this here from the Discord server as requested:
When creating a collation with tailoring rules, the code at https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/pg_locale_icu.c#L461 opens the collator with the rules supplied, but it hardcodes the strength level to UCOL_DEFAULT_STRENGTH. This has the effect of ignoring the "deterministic=false" you may have specified in your CREATE COLLATION call. If, instead of hardcoding UCOL_DEFAULT_STRENGTH, the code honored the deterministic parameter and passed UCOL_PRIMARY for "deterministic=true" or UCOL_SECONDARY for "deterministic=false", this would preserve the attempt to obtain case-insensitivity in the locale while still allowing tailoring as expected.
I have made the modification to pg_locale_icu.c and tested it locally (simply hardcoding UCOL_SECONDARY, without checking the deterministic parameter), and it behaves as expected, though I freely admit my knowledge of how ICU intersects with Postgres is rather limited.
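To make the proposed mapping concrete, here is a minimal sketch of the selection logic only. It is not the actual patch: the helper name and the SKETCH_* enum are hypothetical stand-ins so the snippet compiles without ICU headers. Real code would use ICU's own UCOL_PRIMARY and UCOL_SECONDARY constants from <unicode/ucol.h> and pass the result as the strength argument when opening the collator from rules, in place of UCOL_DEFAULT_STRENGTH.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-ins for ICU's strength constants (assumption: used here only so the
 * sketch builds without ICU; real code would use UCOL_PRIMARY and
 * UCOL_SECONDARY from <unicode/ucol.h>).
 */
typedef enum
{
    SKETCH_UCOL_PRIMARY,
    SKETCH_UCOL_SECONDARY,
    SKETCH_UCOL_TERTIARY
} SketchStrength;

/*
 * Hypothetical helper: choose the strength to pass when opening a collator
 * from tailoring rules, following the mapping proposed in this message
 * rather than always using the default strength.
 */
static SketchStrength
sketch_rules_strength(bool deterministic)
{
    return deterministic ? SKETCH_UCOL_PRIMARY : SKETCH_UCOL_SECONDARY;
}

/* Small self-check showing the two cases of the proposed mapping. */
static void
sketch_self_check(void)
{
    assert(sketch_rules_strength(true) == SKETCH_UCOL_PRIMARY);
    assert(sketch_rules_strength(false) == SKETCH_UCOL_SECONDARY);
}
```

Whether these are the right strength values for each case is exactly the part I would like reviewed; the sketch only shows that the choice would depend on the deterministic flag instead of being fixed.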