| From: | Luis Felippe <luisfelippe(at)protonmail(dot)com> |
|---|---|
| To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | [PATCH] Fix ICU strength not being honored in collation rules |
| Date: | 2025-10-27 18:33:59 |
| Message-ID: | spHSrzQa0K_-Sqq9rNK-d6kelbfJG-z4XP6vn8tliiCHmjNYy45g2QOD92mrsNYqBpvj8Fi-qw4kXZhZmKjSVevzRSOvh6XzcNZBIV5wA3E=@protonmail.com |
| Lists: | pgsql-hackers |
Hello,
I have run into an issue where specifying the rules argument for "CREATE COLLATION" forces the collation strength to tertiary, even when a different strength is set explicitly in the rules string. I discovered that this is because ucol_openRules is called with strength UCOL_DEFAULT_STRENGTH, which overwrites whatever is in the rules string with UCOL_TERTIARY.
This fix changes the call to pass UCOL_DEFAULT instead. This way, UCOL_TERTIARY still applies by default, but a strength set explicitly in the rules string is not overwritten. This is important because there is currently no way to create a collation with custom tailoring rules and a strength other than tertiary.
What happens currently:
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '', deterministic = false); -- strength: tertiary
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '[strength 2]', deterministic = false); -- strength: tertiary
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '[strength 1]', deterministic = false); -- strength: tertiary
What happens after the patch:
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '', deterministic = false); -- strength: tertiary
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '[strength 2]', deterministic = false); -- strength: secondary
CREATE COLLATION my_col (provider = icu, locale = 'und', rules = '[strength 1]', deterministic = false); -- strength: primary
As this only affects cases where the strength was explicitly set but previously ignored, I do not think it is a breaking change.
I have successfully compiled and tested PostgreSQL after this change, and it behaves as documented above.
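One way to observe the effective strength from SQL is to compare strings that differ only in case (a tertiary difference in ICU) or only in accent (a secondary difference). The following is a minimal sketch, not the test that was actually run: the collation names are hypothetical, and the expectations in the comments assume the behavior described above.

```sql
-- Hypothetical collation names, chosen only for this illustration.
CREATE COLLATION rules_secondary (provider = icu, locale = 'und', rules = '[strength 2]', deterministic = false);
CREATE COLLATION rules_primary (provider = icu, locale = 'und', rules = '[strength 1]', deterministic = false);

-- Case differs only at the tertiary level, so this comparison is expected
-- to be true when secondary strength is honored and false when the
-- strength is forced back to tertiary.
SELECT 'a' = 'A' COLLATE rules_secondary AS ignores_case;

-- Accents differ at the secondary level, so this comparison is expected
-- to be true only when primary strength is honored.
SELECT 'a' = 'á' COLLATE rules_primary AS ignores_accent;

-- Clean up the illustration objects.
DROP COLLATION rules_secondary;
DROP COLLATION rules_primary;
```

Both collations are declared with deterministic = false because a deterministic collation treats strings as equal only when they are byte-wise identical, which would hide the effect of the strength setting on equality.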
Thank you in advance,
Luis
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Fix-ICU-strength-not-being-honored-in-collation-rule.patch | text/x-patch | 873 bytes |