Collation DDL inconsistencies

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Collation DDL inconsistencies
Date: 2022-12-07 00:33:38
Message-ID: feccba69685ce9dd9f3eac181631a45120d1ad8e.camel@j-davis.com
Lists: pgsql-hackers


When I looked at the bug:

https://postgr.es/m/CALDQics_oBEYfOnu_zH6yw9WR1waPCmcrqxQ8+39hK3Op=z2UQ@mail.gmail.com

I noticed that the DDL around collations is inconsistent. For instance,
CREATE COLLATION[1] uses the LOCALE, LC_COLLATE, and LC_CTYPE parameters
to specify either libc locales or an ICU locale, whereas CREATE
DATABASE[2] always uses LOCALE, LC_COLLATE, and LC_CTYPE for libc, and
uses ICU_LOCALE when the default collation provider is ICU.

The catalog representation is strange in a different way:
datcollate/collcollate are always for libc, and
daticulocale/colliculocale are for ICU. That means anything that deals
with those fields needs to pick the right one based on the provider.
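
To illustrate the kind of branching I mean, here is a minimal standalone
sketch. It is not PostgreSQL source: the struct, the enum, and
collation_locale() are hypothetical stand-ins, and only the field names
datcollate, datctype, and daticulocale come from the catalog.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical provider codes, standing in for the real catalog values. */
typedef enum { PROVIDER_LIBC, PROVIDER_ICU } Provider;

/* Simplified, hypothetical view of a pg_database row (current layout). */
typedef struct
{
    Provider    provider;
    const char *datcollate;     /* always a libc locale name */
    const char *datctype;       /* always a libc locale name */
    const char *daticulocale;   /* ICU locale; NULL unless provider is ICU */
} DbLocaleRow;

/*
 * Return the string that actually determines collation behavior.
 * Every caller that reads these fields needs a branch like this one.
 */
static const char *
collation_locale(const DbLocaleRow *row)
{
    assert(row != NULL);

    switch (row->provider)
    {
        case PROVIDER_ICU:
            return row->daticulocale;
        case PROVIDER_LIBC:
            return row->datcollate;
    }
    return NULL;                /* unknown provider */
}
```

With an ICU row, collation_locale() returns daticulocale even though
datcollate is also set, which is the part that is easy to get wrong.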

If this were a clean slate, it would make more sense if it were
something like:

datcollate/collcollate: to instantiate pg_locale_t
datctype/collctype: to instantiate pg_locale_t
datlibccollate: used by libc elsewhere
datlibcctype: used by libc elsewhere
daticulocale/colliculocale: remove these fields

That way, if you are instantiating a pg_locale_t, you always just pass
datcollate/datctype/collcollate/collctype, regardless of the provider
(pg_newlocale_from_collation() would figure it out). And if you are
going to call libc directly, you always use
datlibccollate/datlibcctype.
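
A rough sketch of how callers would look under that layout, again
standalone and hypothetical: ProposedDbRow, LocaleSpec,
locale_spec_from_row(), and libc_collate_for_row() are made-up names for
illustration, and the real work of building a pg_locale_t is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical provider codes. */
typedef enum { PROPOSED_LIBC, PROPOSED_ICU } ProposedProvider;

/* Hypothetical row using the proposed field names. */
typedef struct
{
    ProposedProvider provider;
    const char *datcollate;     /* provider-specific; feeds pg_locale_t */
    const char *datctype;       /* provider-specific; feeds pg_locale_t */
    const char *datlibccollate; /* always libc; used by libc elsewhere */
    const char *datlibcctype;   /* always libc; used by libc elsewhere */
} ProposedDbRow;

/* The inputs needed to instantiate a locale object. */
typedef struct
{
    ProposedProvider provider;
    const char *collate;
    const char *ctype;
} LocaleSpec;

/* Same fields for every provider; no branch needed in the caller. */
static LocaleSpec
locale_spec_from_row(const ProposedDbRow *row)
{
    LocaleSpec  spec;

    assert(row != NULL);
    spec.provider = row->provider;
    spec.collate = row->datcollate;
    spec.ctype = row->datctype;
    return spec;
}

/* Code that talks to libc directly always reads the libc-only field. */
static const char *
libc_collate_for_row(const ProposedDbRow *row)
{
    assert(row != NULL);
    return row->datlibccollate;
}
```

The provider check would then live in one place, where the locale object
is built, instead of in every reader of the catalog fields.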

Aside: why don't we support different collate/ctype with ICU? It
appears that u_strToTitle/u_strToUpper/u_strToLower[3] just accept a
"locale" string, and it would be easy enough to pass it whatever is in
datctype/collctype, right? We should validate that it's a valid locale;
but other than that, I don't see the problem.

Thoughts? Implementation-wise, I suppose this could create some
annoyances in pg_dump.

[1] https://www.postgresql.org/docs/devel/sql-createcollation.html
[2] https://www.postgresql.org/docs/devel/sql-createdatabase.html
[3] https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ustring_8h.html

--
Jeff Davis
PostgreSQL Contributor Team - AWS
