Re: ICU for global collation

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz
Subject: Re: ICU for global collation
Date: 2022-09-13 15:51:39
Message-ID: 38d2aac40a991b20adf032e509c6eff5@postgrespro.ru
Lists: pgsql-hackers

On 2022-09-13 15:41, Peter Eisentraut wrote:
> On 13.09.22 07:34, Marina Polyakova wrote:
>> I agree with you that it is more comfortable and more similar to what
>> has already been done in initdb. IMO it would be easier to do it like
>> this:
>>
>> diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
>> index
>> e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26
>> 100644
>> --- a/src/bin/scripts/createdb.c
>> +++ b/src/bin/scripts/createdb.c
>> @@ -161,12 +161,10 @@ main(int argc, char *argv[])
>>
>>      if (locale)
>>      {
>> -        if (lc_ctype)
>> -            pg_fatal("only one of --locale and --lc-ctype can be
>> specified");
>> -        if (lc_collate)
>> -            pg_fatal("only one of --locale and --lc-collate can be
>> specified");
>> -        lc_ctype = locale;
>> -        lc_collate = locale;
>> +        if (!lc_ctype)
>> +            lc_ctype = locale;
>> +        if (!lc_collate)
>> +            lc_collate = locale;
>>      }
>>
>>      if (encoding)
>
> done that way

Thank you!
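
For readers skimming the thread, the diff quoted above turns --locale into a default for --lc-ctype and --lc-collate instead of treating the combination as an error. A minimal standalone sketch of that fallback follows; the helper name apply_locale_default and its signature are hypothetical and do not exist in createdb.c, where the logic lives inline in main():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of createdb.c): use "locale" as the default
 * for whichever of lc_ctype / lc_collate was not given explicitly.
 * Explicitly specified values are left untouched.
 */
static void
apply_locale_default(const char *locale,
                     const char **lc_ctype,
                     const char **lc_collate)
{
    assert(lc_ctype != NULL && lc_collate != NULL);

    if (locale == NULL)
        return;                 /* no --locale given, nothing to default */

    if (*lc_ctype == NULL)
        *lc_ctype = locale;     /* --lc-ctype not given: inherit --locale */
    if (*lc_collate == NULL)
        *lc_collate = locale;   /* --lc-collate not given: inherit --locale */
}
```

With this shape, passing --locale together with only --lc-collate keeps the explicit collation and fills in just the ctype.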

>>> BTW it's somewhat crummy that it uses a string comparison, so if you
>>> write "UTF8" without a dash, it says this; it took me a few minutes
>>> to
>>> see the difference...
>>>
>>> postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE
>>> "en_US.UTF8" LOCALE "en_US.UTF8";
>>> ERROR:  new collation (en_US.UTF8) is incompatible with the collation
>>> of the template database (en_US.UTF-8)
>>
>> Perhaps we could check the locale itself with the function
>> normalize_libc_locale_name (collationcmds.c). But ISTM that the
>> current check is a safety net in case the function
>> pg_get_encoding_from_locale (chklocale.c) returns -1 or
>> PG_SQL_ASCII...
>
> This is not new behavior in PG15, is it?

No, AFAICS it has always existed [1].

[1]
https://github.com/postgres/postgres/commit/61d967498802ab86d8897cb3c61740d7e9d712f6

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
