Re: ICU for global collation

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz
Subject: Re: ICU for global collation
Date: 2022-09-13 05:34:16
Message-ID: 60facfef2a8d54b227500260e86d823e@postgrespro.ru
Lists: pgsql-hackers

On 2022-09-09 19:46, Justin Pryzby wrote:
> In pg14:
> |postgres=# create database a LC_COLLATE C LC_CTYPE C LOCALE C;
> |ERROR: conflicting or redundant options
> |DETAIL: LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE.
>
> In pg15:
> |postgres=# create database a LC_COLLATE "en_US.UTF-8" LC_CTYPE "en_US.UTF-8" LOCALE "en_US.UTF-8" ;
> |CREATE DATABASE
>
> f2553d430 actually relaxed the restriction by removing this check:
>
> -    if (dlocale && (dcollate || dctype))
> -        ereport(ERROR,
> -                (errcode(ERRCODE_SYNTAX_ERROR),
> -                 errmsg("conflicting or redundant options"),
> -                 errdetail("LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE.")));
>
> But isn't the right fix to do the corresponding thing in createdb
> (relaxing the frontend restriction rather than reverting its relaxation
> in the backend)?
>
> diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
> index e523e58b218..5b80e56dfd9 100644
> --- a/src/bin/scripts/createdb.c
> +++ b/src/bin/scripts/createdb.c
> @@ -159,15 +159,10 @@ main(int argc, char *argv[])
>          exit(1);
>      }
>
> -    if (locale)
> -    {
> -        if (lc_ctype)
> -            pg_fatal("only one of --locale and --lc-ctype can be specified");
> -        if (lc_collate)
> -            pg_fatal("only one of --locale and --lc-collate can be specified");
> +    if (locale && !lc_ctype)
>          lc_ctype = locale;
> +    if (locale && !lc_collate)
>          lc_collate = locale;
> -    }
>
>      if (encoding)
>      {

I agree with you that this is more convenient and closer to what initdb
already does. IMO it would be easier to do it like this:

diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -161,12 +161,10 @@ main(int argc, char *argv[])

     if (locale)
     {
-        if (lc_ctype)
-            pg_fatal("only one of --locale and --lc-ctype can be specified");
-        if (lc_collate)
-            pg_fatal("only one of --locale and --lc-collate can be specified");
-        lc_ctype = locale;
-        lc_collate = locale;
+        if (!lc_ctype)
+            lc_ctype = locale;
+        if (!lc_collate)
+            lc_collate = locale;
     }

     if (encoding)
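
To make the intended behaviour explicit, here is a minimal standalone
sketch of the fallback rule: --locale only fills in the categories that
were not given explicitly. The struct and the function name are
hypothetical, made up for illustration; this is not code from createdb.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three command-line options. */
typedef struct LocaleOptions
{
    const char *locale;     /* --locale, or NULL if not given */
    const char *lc_collate; /* --lc-collate, or NULL if not given */
    const char *lc_ctype;   /* --lc-ctype, or NULL if not given */
} LocaleOptions;

/*
 * Let --locale supply only the categories that were not set explicitly.
 * An explicit --lc-collate or --lc-ctype always wins over --locale.
 */
void
resolve_locale_options(LocaleOptions *opts)
{
    assert(opts != NULL);

    if (opts->locale)
    {
        if (!opts->lc_ctype)
            opts->lc_ctype = opts->locale;
        if (!opts->lc_collate)
            opts->lc_collate = opts->locale;
    }
}
```

With this rule, passing --locale together with --lc-ctype keeps the
explicit ctype and takes only the collation from --locale; with no
--locale at all, nothing is changed.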

Should we change the behaviour of createdb and CREATE DATABASE in
previous major versions?

> BTW it's somewhat crummy that it uses a string comparison, so if you
> write "UTF8" without a dash, it says this; it took me a few minutes to
> see the difference...
>
> postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE "en_US.UTF8" LOCALE "en_US.UTF8";
> ERROR: new collation (en_US.UTF8) is incompatible with the collation of the template database (en_US.UTF-8)

Perhaps we could check the locale itself with the function
normalize_libc_locale_name (collationcmds.c). But ISTM that the current
check is a safety net in case the function pg_get_encoding_from_locale
(chklocale.c) returns -1 or PG_SQL_ASCII...
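
As a rough illustration of comparing the locale itself rather than the
raw string, here is a standalone sketch that drops the ".encoding" part
of a libc locale name before comparing. As I understand it, this is the
general idea behind normalize_libc_locale_name, but the code below is
not copied from collationcmds.c, and both function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Copy src to dst, dropping any encoding tag such as ".UTF-8" or ".utf8".
 * A modifier such as "@latin" is kept. dst must have room for
 * strlen(src) + 1 bytes.
 */
void
strip_locale_encoding(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    while (*src)
    {
        if (*src == '.')
        {
            /* skip the dot and the encoding name that follows it */
            src++;
            while ((*src >= 'A' && *src <= 'Z') ||
                   (*src >= 'a' && *src <= 'z') ||
                   (*src >= '0' && *src <= '9') ||
                   *src == '-')
                src++;
        }
        else
            *dst++ = *src++;
    }
    *dst = '\0';
}

/* Compare two locale names while ignoring their encoding tags. */
bool
same_locale_ignoring_encoding(const char *a, const char *b)
{
    char        na[128];
    char        nb[128];

    /* refuse names that would not fit in the local buffers */
    if (strlen(a) >= sizeof(na) || strlen(b) >= sizeof(nb))
        return false;

    strip_locale_encoding(na, a);
    strip_locale_encoding(nb, b);
    return strcmp(na, nb) == 0;
}
```

With this, "en_US.UTF8" and "en_US.UTF-8" compare as equal. Note that
the comparison ignores the encoding entirely, so it would also treat
"en_US.ISO-8859-1" and "en_US.UTF-8" as the same locale; the encoding
would still have to be checked separately, which is why the existing
string comparison may be worth keeping as a safety net.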

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
