RE: BUG #19045: Applying custom collation rules appears to erase existing rules

From: Todd Lang <Todd(dot)Lang(at)D2L(dot)com>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: BUG #19045: Applying custom collation rules appears to erase existing rules
Date: 2025-09-11 18:38:29
Message-ID: YT2PPF9592366182EEE0395178755631609BE09A@YT2PPF959236618.CANPRD01.PROD.OUTLOOK.COM
Lists: pgsql-bugs

My apologies, that should be pg_locale_icu.c, not pg_locale_c.

-----Original Message-----
From: Todd Lang <Todd(dot)Lang(at)D2L(dot)com>
Sent: Thursday, September 11, 2025 2:15 PM
To: Todd Lang <Todd(dot)Lang(at)D2L(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: RE: BUG #19045: Applying custom collation rules appears to erase existing rules

FWIW, I've started putting in some logging to see if I can figure out what's going on here.

What seems to happen is that at backend\utils\adt\pg_locale_c:347 it asks for the existing rules to prepare to append the custom rules. However, I can't seem to track it actually returning any rules. The length returned is always 0. It then dutifully appends the custom rules to this empty set of rules and then applies them, and that is exactly the behaviour I seem to be observing. I'm still trying to figure out why ucol_getRules isn't returning the rules for the supplied locale.
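
To make the observation above concrete, here is a minimal standalone sketch of the copy-then-append step. It is not the PostgreSQL code: `combine_rules` is a hypothetical helper, and plain `char` strings stand in for ICU's `UChar` buffers so that it builds without ICU. It only shows that when the standard rules come back with length 0, the combined rule string is just the custom rules.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the copy-then-append step described above.
 * Plain char strings replace ICU's UChar buffers (an assumption made
 * so the sketch compiles without ICU).
 * Returns a newly allocated string the caller must free, or NULL if
 * allocation fails.
 */
static char *combine_rules(const char *std_rules, const char *my_rules)
{
    assert(std_rules != NULL && my_rules != NULL);

    /* Room for both rule strings plus the terminating NUL. */
    size_t total = strlen(std_rules) + strlen(my_rules) + 1;
    char *all_rules = malloc(total);

    if (all_rules == NULL)
        return NULL;

    strcpy(all_rules, std_rules);   /* start from the standard rules */
    strcat(all_rules, my_rules);    /* then append the custom rules */
    return all_rules;
}
```

Calling `combine_rules("", "&a < b")` yields `"&a < b"` alone, which matches what the logging shows: nothing from the locale is carried into the rule string that is finally applied.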

-----Original Message-----
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
Sent: Tuesday, September 9, 2025 11:14 AM
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: Todd Lang <Todd(dot)Lang(at)D2L(dot)com>
Subject: BUG #19045: Applying custom collation rules appears to erase existing rules

The following bug has been logged on the website:

Bug reference: 19045
Logged by: Todd Lang
Email address: todd(dot)lang(at)d2l(dot)com
PostgreSQL version: 17.6
Operating system: Windows 10 64 bit
Description:

Setting up a collation on a table with the following:

DROP TABLE IF EXISTS test_table;
DROP COLLATION IF EXISTS CI_AS;
CREATE COLLATION CI_AS (PROVIDER=icu, LOCALE='en-US-u-ks-level2', DETERMINISTIC=false);
CREATE TABLE test_table (field1 varchar(256) COLLATE CI_AS);
INSERT INTO test_table VALUES (U&'this is a string.');
INSERT INTO test_table VALUES (U&'THIS IS A STRING.');

Then issue the query:
SELECT * FROM test_table WHERE field1 = 'This is a string.';

This should provide:
"this is a string."
"THIS IS A STRING."

Now alter the collation slightly to include rules. (Note the CREATE COLLATION line)

DROP TABLE IF EXISTS test_table;
DROP COLLATION IF EXISTS CI_AS;
CREATE COLLATION CI_AS (PROVIDER=icu, LOCALE='en-US-u-ks-level2', DETERMINISTIC=false, rules='');
CREATE TABLE test_table (field1 varchar(256) COLLATE CI_AS);
INSERT INTO test_table VALUES (U&'this is a string.');
INSERT INTO test_table VALUES (U&'THIS IS A STRING.');

Now issue:
SELECT * FROM test_table WHERE field1 = 'This is a string.';

There are no results.

From the documentation it seems that any text supplied should be additional rules to the standard rules.
In `pg_locale_icu.c` in the `make_icu_collator` method at line 455, it seems that it does a simple:

u_strcpy(all_rules, std_rules);
u_strcat(all_rules, my_rules);

With the above change, that should just append nothing to the standard rules and so cause no change in behaviour. This, however, is not the case.

I have tried it with various permutations of the `rules`, and while any rules supplied during the CREATE COLLATION call appear to function, it seems that all standard rules are forgotten when this option is utilized.
