Re: ICU locale validation / canonicalization

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: ICU locale validation / canonicalization
Date: 2023-03-30 06:59:41
Message-ID: 5293249a-a361-5a5a-a61e-4e8049a75837@enterprisedb.com
Lists: pgsql-hackers

On 30.03.23 04:33, Jeff Davis wrote:
> Attached is a new version of the final patch, which performs
> canonicalization. I'm not 100% sure that it's wanted, but it still
> seems like a good idea to get the locales into a standard format in the
> catalogs, and if a lot more people start using ICU in v16 (because it's
> the default), then it would be a good time to do it. But perhaps there
> are risks?

I say, let's do it.

I don't think we should show the notice when the canonicalization
doesn't change anything. This is not useful:

+NOTICE: using language tag "und-u-kf-upper" for locale "und-u-kf-upper"

Also, the message should be phrased from the perspective of the user
rather than in ICU jargon, for example:

NOTICE: using canonicalized form "%s" for locale specification "%s"

(Still too many big words?)
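
To make the two points above concrete, here is a rough sketch of the
behavior I have in mind. It is not taken from the patch: the helper name
build_canonicalization_notice and its buffer-based interface are made up
for illustration, and real code would go through ereport() rather than
format into a caller-supplied buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): build the notice text only
 * when canonicalization actually changed the locale string.
 *
 * Returns true and fills buf if a notice should be shown; returns false
 * and leaves buf empty if the canonical form equals what the user gave.
 */
static bool
build_canonicalization_notice(const char *given, const char *canonical,
                              char *buf, size_t buflen)
{
    assert(given != NULL && canonical != NULL);
    assert(buf != NULL && buflen > 0);

    /* Nothing changed, so there is nothing worth telling the user. */
    if (strcmp(given, canonical) == 0)
    {
        buf[0] = '\0';
        return false;
    }

    /* Wording as proposed above; snprintf truncates if buf is too small. */
    snprintf(buf, buflen,
             "using canonicalized form \"%s\" for locale specification \"%s\"",
             canonical, given);
    return true;
}
```

With this shape, the "und-u-kf-upper" case above produces no notice at
all, and any input that canonicalization does rewrite gets the
user-oriented wording.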

I don't think the special handling of IsBinaryUpgrade is needed or
wanted. I would hope that with this feature, all old-style locale IDs
would go away, but this way we would keep them forever. If we believe
that canonicalization is safe, then I don't see why we cannot apply it
during binary upgrade.

The patch also needs documentation updates in doc/src/sgml/charset.sgml.
