Supporting non-deterministic collations with tailoring rules.

From: Todd Lang <Todd(dot)Lang(at)D2L(dot)com>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Supporting non-deterministic collations with tailoring rules.
Date: 2025-09-23 14:51:28
Message-ID: YT2PPF959236618377A072745A280E278F4BE1DA@YT2PPF959236618.CANPRD01.PROD.OUTLOOK.COM
Lists: pgsql-hackers

Reposting this here from the Discord server as requested:

When creating a collation, the code at https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/pg_locale_icu.c#L461 opens the collator with the supplied tailoring rules, but it hardcodes the strength level as UCOL_DEFAULT_STRENGTH. This has the effect of ignoring the "deterministic=false" you may have specified in your CREATE COLLATION call. If, instead of hardcoding UCOL_DEFAULT_STRENGTH, the code took the deterministic parameter into account and passed either UCOL_PRIMARY for "deterministic=true" or UCOL_SECONDARY for "deterministic=false", it would preserve the attempt to obtain case-insensitivity in the locale while still allowing tailoring as expected.

I have made this modification to pg_locale_icu.c and tested it locally (simply hardcoding UCOL_SECONDARY rather than checking the deterministic parameter), and it behaves as expected, though I freely admit my knowledge of how ICU intersects with Postgres is rather limited.
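
For illustration only, the strength selection described above could be sketched roughly as follows. This is not a patch against PostgreSQL: choose_rule_strength is a hypothetical helper name, and the enum merely stands in for ICU's UCollationStrength constants so the snippet compiles without the ICU headers. The mapping is the one proposed in this message and has not been reviewed.

```c
#include <stdbool.h>

/*
 * Stand-ins for ICU's UCollationStrength constants so this sketch compiles
 * without the ICU headers. Real code would include <unicode/ucol.h> instead.
 */
typedef enum
{
    UCOL_PRIMARY = 0,
    UCOL_SECONDARY = 1,
    UCOL_TERTIARY = 2,
    UCOL_DEFAULT_STRENGTH = UCOL_TERTIARY
} UCollationStrength;

/*
 * Hypothetical helper (not present in PostgreSQL): pick the strength to pass
 * to ucol_openRules() from the collation's deterministic flag, using the
 * mapping proposed in this message.
 */
UCollationStrength
choose_rule_strength(bool deterministic)
{
    return deterministic ? UCOL_PRIMARY : UCOL_SECONDARY;
}
```

The result would then be passed as the strength argument of ucol_openRules() in place of the hardcoded UCOL_DEFAULT_STRENGTH.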
