Re: ICU for global collation

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pryzby(at)telsasoft(dot)com, rjuju123(at)gmail(dot)com, daniel(at)manitou-mail(dot)org, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz
Subject: Re: ICU for global collation
Date: 2022-09-20 09:59:04
Message-ID: 218b963e-dd66-626a-abd8-962afb587876@enterprisedb.com
Lists: pgsql-hackers

On 17.09.22 10:33, Marina Polyakova wrote:
> Thanks to Kyotaro Horiguchi's review, we found out that there are
> interesting cases due to the order of some ICU checks:
>
> 1. ICU locale vs supported encoding:
>
> 1.1.
>
> On 2022-09-15 09:52, Kyotaro Horiguchi wrote:
>> If I executed initdb as follows, I would be told to specify
>> --icu-locale option.
>>
>>> $ initdb --encoding sql-ascii --locale-provider icu hoge
>>> ...
>>> initdb: error: ICU locale must be specified
>>
>> However, when I reran the command, it complains about incompatible
>> encoding this time.  I think it's more user-friendly to check for the
>> encoding compatibility before the check for missing --icu-locale
>> option.

This is a valid point, but it would require quite a bit of work to move all
those checks around and re-verify the result, so I don't want to do it
in PG15.
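
To make the ordering issue in 1.1 concrete, here is a toy model in Python. It is
only a sketch and not initdb's actual code: the function name, the
`encoding_first` switch, and the one-element set of unsupported encodings are
made up for illustration (SQL_ASCII is the only case taken from this thread).
Both checks are correct on their own; the order only decides which error the
user sees first.

```python
# Toy model of option-validation order. Hypothetical names, not initdb code.
ICU_UNSUPPORTED_ENCODINGS = {"SQL_ASCII"}  # only the case seen in this thread


def first_error(encoding, provider, icu_locale, encoding_first):
    """Return the first error message hit, or None if all checks pass."""

    def check_locale():
        if provider == "icu" and icu_locale is None:
            return "ICU locale must be specified"
        return None

    def check_encoding():
        if provider == "icu" and encoding in ICU_UNSUPPORTED_ENCODINGS:
            return f'encoding "{encoding}" is not supported with ICU provider'
        return None

    # The two orders differ only in which message is reported first.
    if encoding_first:
        checks = [check_encoding, check_locale]
    else:
        checks = [check_locale, check_encoding]

    for check in checks:
        error = check()
        if error is not None:
            return error
    return None


# Order as reported: the missing locale is flagged first, and the encoding
# problem only shows up on the second run.
print(first_error("SQL_ASCII", "icu", None, encoding_first=False))
# Suggested order: the encoding problem is reported on the first run.
print(first_error("SQL_ASCII", "icu", None, encoding_first=True))
```

In the real code the checks are spread over several places, which is why
reordering them is more work than this model suggests.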

> 1.2. (ok?)
>
> $ initdb --encoding sql-ascii --icu-locale en-US hoge
> initdb: error: --icu-locale cannot be specified unless locale provider
> "icu" is chosen
>
> $ initdb --encoding sql-ascii --icu-locale en-US --locale-provider icu hoge
> ...
> initdb: error: encoding mismatch
> initdb: detail: The encoding you selected (SQL_ASCII) is not supported
> with the ICU provider.
> initdb: hint: Rerun initdb and either do not specify an encoding
> explicitly, or choose a matching combination.
>
> $ createdb --encoding sql-ascii --icu-locale en-US hoge
> createdb: error: database creation failed: ERROR:  ICU locale cannot be
> specified unless locale provider is ICU
> $ createdb --encoding sql-ascii --icu-locale en-US --locale-provider icu
> hoge
> createdb: error: database creation failed: ERROR:  encoding "SQL_ASCII"
> is not supported with ICU provider

I don't see a problem here.

> 2. For builds without ICU:
>
> 2.1.
>
> $ initdb --locale-provider icu hoge
> ...
> initdb: error: ICU locale must be specified
>
> $ initdb --locale-provider icu --icu-locale en-US hoge
> ...
> initdb: error: ICU is not supported in this build
>
> $ createdb --locale-provider icu hoge
> createdb: error: database creation failed: ERROR:  ICU locale must be
> specified
>
> $ createdb --locale-provider icu --icu-locale en-US hoge
> createdb: error: database creation failed: ERROR:  ICU is not supported
> in this build
>
> IMO, it would be more user-friendly to report the unsupported build in
> the first runs too.

Again, this would require reorganizing a bunch of code to get some
cosmetic benefit, which isn't a good idea now for PG15.

> 2.2. (ok?)
> 2.3.

Same here.

> 3.
>
> The locale provider is ICU, but it has not yet been set from the
> template database:
>
>> $ initdb --locale-provider icu --icu-locale en-US -D data &&
>> pg_ctl -D data -l logfile start &&
>> createdb --icu-locale ru-RU --template template0 mydb
>> ...
>> createdb: error: database creation failed: ERROR:  ICU locale cannot be
>> specified unless locale provider is ICU

Please see the attached patch for a fix. Does that work for you?
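
For readers following case 3 without the patch at hand, the problem can be
modelled in a few lines of Python. This is a sketch of the ordering issue as
described above, not the patch itself; `resolve_provider` and its parameters
are hypothetical names. The point is that the "ICU locale requires ICU
provider" check has to look at the effective provider, which may come from the
template database when none is given explicitly.

```python
# Toy model of case 3. Hypothetical names, not the CREATE DATABASE code.
def resolve_provider(template_provider, provider=None, icu_locale=None,
                     check_early=False):
    """Return the effective locale provider, or raise ValueError."""

    def check(effective_provider):
        if icu_locale is not None and effective_provider != "icu":
            raise ValueError(
                "ICU locale cannot be specified unless locale provider is ICU")

    if check_early:
        # Problematic order: the provider is still unset here if the user
        # did not pass one explicitly.
        check(provider)

    # Fall back to the template database's provider when none was given.
    effective = provider if provider is not None else template_provider

    if not check_early:
        # Check only after the template's setting has been taken into account.
        check(effective)
    return effective


# Template uses ICU, user passes only an ICU locale (as in the report above).
try:
    resolve_provider("icu", icu_locale="ru-RU", check_early=True)
except ValueError as e:
    print("early check:", e)

print("late check:", resolve_provider("icu", icu_locale="ru-RU"))
```

With the late check, specifying an ICU locale against a libc template still
fails, which is the intended behavior.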

Attachment: 0001-Improve-ICU-option-handling-in-CREATE-DATABASE.patch (text/plain, 1.8 KB)
