Re: ICU for global collation

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pryzby(at)telsasoft(dot)com, rjuju123(at)gmail(dot)com, daniel(at)manitou-mail(dot)org, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz, peter(dot)eisentraut(at)enterprisedb(dot)com
Subject: Re: ICU for global collation
Date: 2022-10-21 14:32:38
Message-ID: 727b5d5160f845dcf5e0818e625a6e56@postgrespro.ru
Lists: pgsql-hackers

Hello!

I discovered some interesting behaviour during installcheck runs when the
cluster was initialized with the ICU locale provider:

$ initdb --locale-provider icu --icu-locale en-US -D data &&
pg_ctl -D data -l logfile start

1) The ECPG tests fail because they use the SQL_ASCII encoding [1], while
the database template0 uses the ICU locale provider, and SQL_ASCII is not
supported by ICU:

$ make -C src/interfaces/ecpg/ installcheck
...
============== creating database "ecpg1_regression" ==============
ERROR: encoding "SQL_ASCII" is not supported with ICU provider
ERROR: database "ecpg1_regression" does not exist
command failed: "/home/marina/postgresql/master/my/inst/bin/psql" -X -c
"CREATE DATABASE \"ecpg1_regression\" TEMPLATE=template0
ENCODING='SQL_ASCII'" -c "ALTER DATABASE \"ecpg1_regression\" SET
lc_messages TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_monetary
TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_numeric TO 'C';ALTER
DATABASE \"ecpg1_regression\" SET lc_time TO 'C';ALTER DATABASE
\"ecpg1_regression\" SET bytea_output TO 'hex';ALTER DATABASE
\"ecpg1_regression\" SET timezone_abbreviations TO 'Default';"
"postgres"

2) The option --no-locale in pg_regress is described as "use C locale"
[2]. However, on a cluster initialized with ICU the created databases
still use the ICU locale provider with the cluster's ICU locale inherited
from template0 (see diff_check_backend_used_provider.patch):

$ make NO_LOCALE=1 installcheck

In regression.diffs:

diff -U3 /home/marina/postgresql/master/src/test/regress/expected/test_setup.out /home/marina/postgresql/master/src/test/regress/results/test_setup.out
--- /home/marina/postgresql/master/src/test/regress/expected/test_setup.out 2022-09-27 05:31:27.674628815 +0300
+++ /home/marina/postgresql/master/src/test/regress/results/test_setup.out 2022-10-21 15:09:31.232992885 +0300
@@ -143,6 +143,798 @@
\set filename :abs_srcdir '/data/person.data'
COPY person FROM :'filename';
VACUUM ANALYZE person;
+NOTICE: varstrfastcmp_locale sss->collate_c 0 sss->locale 0xefacd0
+NOTICE: varstrfastcmp_locale sss->locale->provider i
+NOTICE: varstrfastcmp_locale sss->locale->info.icu.locale en-US
...
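
The same thing can probably be confirmed without the debugging patch by
looking at pg_database. This is only a sketch: provider_query is a
hypothetical helper name, and the daticulocale column name is an assumption
based on the catalog layout at the time of writing (other releases may name
it differently). A datlocprovider of 'i' means ICU and 'c' means libc:

```shell
# Hypothetical helper: print a catalog query that lists the locale
# provider recorded for every database in the cluster.
# Assumption: the ICU locale column is called daticulocale.
provider_query() {
    printf '%s\n' "SELECT datname, datlocprovider, daticulocale FROM pg_database ORDER BY datname;"
}

# Usage (needs a running server, so it is left commented out):
# psql -X -At -c "$(provider_query)" postgres
```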

The patch diff_fix_pg_regress_create_database.patch fixes both issues
for me.
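
As a rough illustration of the kind of statement that should avoid both
problems (this is not a copy of the attached patch), the database can be
created with the libc provider and the C locale named explicitly instead of
inheriting ICU from template0. The helper name create_db_sql is
hypothetical. LOCALE_PROVIDER is a CREATE DATABASE option in PostgreSQL 15
and later, and I am assuming here that template0 is used as the template:

```shell
# Hypothetical helper: print a CREATE DATABASE statement that pins the
# libc provider and the C locale rather than inheriting template0's ICU
# settings. The second argument (encoding) is optional.
create_db_sql() {
    dbname=$1
    encoding=${2:-}
    sql="CREATE DATABASE \"$dbname\" TEMPLATE=template0 LOCALE_PROVIDER=libc LOCALE='C'"
    if [ -n "$encoding" ]; then
        sql="$sql ENCODING='$encoding'"
    fi
    printf '%s\n' "$sql"
}

# Usage (needs a running server, so it is left commented out):
# psql -X -c "$(create_db_sql ecpg1_regression SQL_ASCII)" postgres
```

With libc as the provider, the SQL_ASCII encoding requested by the ECPG
tests should be accepted again, and --no-locale would really mean the C
locale.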

[1]
https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/interfaces/ecpg/test/Makefile#L18
[2]
https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/test/regress/pg_regress.c#L1992

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
diff_check_backend_used_provider.patch text/x-diff 9.7 KB
diff_fix_pg_regress_create_database.patch text/x-diff 1.5 KB
