BUG #16570: Collation not working

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: arnaud(dot)perrier(at)gmail(dot)com
Subject: BUG #16570: Collation not working
Date: 2020-08-03 07:14:06
Message-ID: 16570-58cc04e1a6ef3c3f@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16570
Logged by: Arnaud Perrier
Email address: arnaud(dot)perrier(at)gmail(dot)com
PostgreSQL version: 12.3
Operating system: Windows / Linux
Description:

I've installed Postgresql 12 on both Wnindows and Linux CentOS 8.

For my project, I needed to create my own ICU Collation and apply it to all
character columns (either in column creation or order by requests =
preferred).

Before doing so, I tried to make a simple test case to check if it's works
as expected ... and it's not.

A simple table with some numeric and alphabetic data

DROP TABLE IF EXISTS TBL;
CREATE TABLE TBL ( TEXT1 CHARACTER(5), TEXT2 CHARACTER(5) );
INSERT INTO TBL VALUES
('aaaaa', 'aaaaa')
,('aaaaa', '00000')
,('aaaaa', 'bbbbb')
,('aaaaa', '11111')

,('bbbbb', '22222')
,('00000', '22222')
,('ccccc', '22222')
,('11111', '22222');

I've installed Postgresql 12 on both Wnindows and Linux CentOS 8.

For my project, I needed to create my own ICU Collation and apply it to all
character columns (either in column creation or order by requests =
preferred).

Before doing so, I tried to make a simple test case to check if it's works
as expected ... and it's not.

A simple table with some numeric and alphabetic data

DROP TABLE IF EXISTS TBL;
CREATE TABLE TBL ( TEXT1 CHARACTER(5), TEXT2 CHARACTER(5) );
INSERT INTO TBL VALUES
('aaaaa', 'aaaaa')
,('aaaaa', '00000')
,('aaaaa', 'bbbbb')
,('aaaaa', '11111')

,('bbbbb', '22222')
,('00000', '22222')
,('ccccc', '22222')
,('11111', '22222');

The collation to order digits after latin characters from the official
documentation https://www.postgresql.org/docs/12/collation.html

CREATE COLLATION digitslast (provider = icu, locale =
'en-u-kr-latn-digit');
CREATE COLLATION digitslast (provider = icu, locale =
'en(at)colReorder=latn-digit');
Sort digits after Latin letters. (The default is digits before
letters.)

I've installed Postgresql 12 on both Wnindows and Linux CentOS 8.

For my project, I needed to create my own ICU Collation and apply it to all
character columns (either in column creation or order by requests =
preferred).

Before doing so, I tried to make a simple test case to check if it's works
as expected ... and it's not.

A simple table with some numeric and alphabetic data

DROP TABLE IF EXISTS TBL;
CREATE TABLE TBL ( TEXT1 CHARACTER(5), TEXT2 CHARACTER(5) );
INSERT INTO TBL VALUES
('aaaaa', 'aaaaa')
,('aaaaa', '00000')
,('aaaaa', 'bbbbb')
,('aaaaa', '11111')

,('bbbbb', '22222')
,('00000', '22222')
,('ccccc', '22222')
,('11111', '22222');
The collation to order digits after latin characters from the official
documentation https://www.postgresql.org/docs/12/collation.html

CREATE COLLATION digitslast (provider = icu, locale =
'en-u-kr-latn-digit');
CREATE COLLATION digitslast (provider = icu, locale =
'en(at)colReorder=latn-digit');
Sort digits after Latin letters. (The default is digits before
letters.)

The testing requests

1/ SELECT * FROM TBL;
2/ SELECT * FROM TBL ORDER BY TEXT1, TEXT2;
3/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast;
4/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast;
5/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast, TEXT2 COLLATE
digitslast;
6/ SELECT * FROM TBL ORDER BY TEXT2 COLLATE digitslast, TEXT1 COLLATE
digitslast;
The results on Windows = collation not works at all

1/ OK = TEXT1 + TEXT2 = digits before letters
2/ OK = TEXT1 + TEXT2 = digits before letters
3/ KO = TEXT1 + TEXT2 = digits before letters
4/ KO = TEXT1 + TEXT2 = digits before letters
5/ KO = TEXT1 + TEXT2 = digits before letters
6/ KO = TEXT1 + TEXT2 = digits before letters
The results on Linux Centos 8 = collation works only for 1st column from
ORDER BY

1/ OK = TEXT1 + TEXT2 = digits before letters
2/ OK = TEXT1 + TEXT2 = digits before letters
3/ OK = TEXT1 = digits after letters + TEXT2 = digits before letters
4/ OK = TEXT1 = digits before letters + TEXT2 = digits after letters
5/ KO = TEXT1 = digits after letters + TEXT2 = digits before letters
6/ KO = TEXT1 = digits before letters + TEXT2 = digits after letters
If I applied the COLLATE clause on the table columns, it's not working
also.

ICU 53 is packaged to all Postgresql 12 installation and --with-icu argument
is present on Postgresql configuration.

Does anybody has any clue to make this simple test case works ?

Post also on stack overflow :
https://stackoverflow.com/questions/63054543/postgresql-12-collation-not-working-windows-linux

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2020-08-03 07:26:09 BUG #16571: "time with time zone" type can input '00:00:00+1559'::time with time zone
Previous Message PG Bug reporting form 2020-08-03 03:28:27 BUG #16569: Document "59.2. Built-in Operator Classes" have a clerical error