From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Sandro Santilli <strk(at)kbt(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Regina Obe <lr(at)pcorp(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Order changes in PG16 since ICU introduction |
Date: | 2023-04-28 21:35:25 |
Message-ID: | 654a49f7ff7461bcf47be4181430678d45f93858.camel@j-davis.com |
Lists: | pgsql-hackers |
On Thu, 2023-04-27 at 14:23 +0200, Daniel Verite wrote:
> This should be pg_strcasecmp(...) == 0
Good catch, thank you! Fixed in updated patches.
> postgres=# create database lat9 locale 'fr_FR@euro' encoding LATIN9
> template 'template0';
> ERROR: could not convert locale name "fr_FR@euro" to language tag:
> U_ILLEGAL_ARGUMENT_ERROR
ICU 63 and earlier convert it without error to the language tag
'fr-FR-u-cu-eur', which is correct. ICU 64 removed support for
transforming some locale variants, apparently because they consider
those variants obsolete:
https://unicode-org.atlassian.net/browse/ICU-22268
https://unicode-org.atlassian.net/browse/ICU-20187
(Aside: how obsolete are those variants?)
It's frustrating that they'd remove such transformations from the
canonicalization process.
Fortunately, it looks like it's easy enough to do the transformation
ourselves. The only problematic format is '...@VARIANT'. The other
format 'fr_FR_EURO' doesn't seem to be a valid glibc locale name[1] and
Windows seems to use BCP 47[2].
And there don't seem to be a lot of variants to handle. ICU 63 only
handles 3 variants, so that's what my patch does. Any unknown variant
between 5 and 8 characters won't throw an error. There could be more
problem cases, but I'm not sure how much of a practical problem they
are.
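To make the '...@VARIANT' fix-up concrete, here is a rough sketch of the
idea in Python. It is not the code from the attached patches (those are
C), and the function name and table are hypothetical. Only the "euro"
mapping is confirmed above; the "pinyin" and "stroke" rows are
assumptions about which legacy variants ICU 63 understood.

```python
# Sketch only: rewrite a libc-style "lang_REGION[.codeset]@variant" name
# into a BCP 47-style language tag for a small set of known variants.

# Hypothetical mapping table. Only the "euro" row is confirmed by the
# message; "pinyin" and "stroke" are assumptions.
KNOWN_VARIANTS = {
    "euro": "u-cu-eur",
    "pinyin": "u-co-pinyin",
    "stroke": "u-co-stroke",
}


def fix_libc_variant(locale_name: str) -> str:
    """Return a language tag if the '@variant' is known, else the input unchanged."""
    base, sep, variant = locale_name.partition("@")
    if not sep:
        return locale_name  # no '@variant' part, nothing to fix up

    extension = KNOWN_VARIANTS.get(variant.lower())
    if extension is None:
        return locale_name  # unknown variant: leave it for ICU to interpret

    # Drop an optional ".codeset" suffix, then switch '_' to '-'.
    lang_region = base.split(".", 1)[0].replace("_", "-")
    return f"{lang_region}-{extension}"


print(fix_libc_variant("fr_FR@euro"))
```

Unknown variants are passed through untouched in this sketch, which
mirrors the point above that they should not by themselves raise an
error.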
If we try to keep the meaning of LOCALE to only LC_COLLATE and
LC_CTYPE, that will continue to be confusing for anyone that uses
provider=icu.
Regards,
Jeff Davis
[1]
https://www.gnu.org/software/libc/manual/html_node/Locale-Names.html
[2]
https://learn.microsoft.com/en-us/windows/win32/intl/locale-names
Attachment | Content-Type | Size |
---|---|---|
v3-0001-ICU-do-not-convert-locale-C-to-en-US-u-va-posix.patch | text/x-patch | 6.9 KB |
v3-0002-ICU-support-locale-C-with-the-same-behavior-as-li.patch | text/x-patch | 12.2 KB |
v3-0003-ICU-fix-up-old-libc-style-locale-strings.patch | text/x-patch | 9.6 KB |
v3-0004-Make-LOCALE-apply-to-ICU_LOCALE-for-CREATE-DATABA.patch | text/x-patch | 16.3 KB |