Re: Character expansion with ICU collations

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Character expansion with ICU collations
Date: 2021-06-09 17:54:54
Message-ID: f7a2284c-9208-665e-d830-34e55e8d6f4d@enterprisedb.com
Lists: pgsql-hackers

On 09.06.21 17:31, Finnerty, Jim wrote:
> CREATE COLLATION CI_AS (provider = icu,
> locale='utf8@colStrength=secondary', deterministic = false);
>
> CREATE TABLE MyTable3
> (
>
>     ID INT IDENTITY(1, 1),
>     Comments VARCHAR(100)
>
> )
>
> INSERT INTO MyTable3 (Comments) VALUES ('strasse')
> INSERT INTO MyTable3 (Comments) VALUES ('straße')
> SELECT * FROM MyTable3 WHERE Comments COLLATE CI_AS = 'strasse'
> SELECT * FROM MyTable3 WHERE Comments COLLATE CI_AS = 'straße'
>
> We would like to control whether each SELECT statement finds both
> records (because the sort key of ‘ß’ equals the sort key of ‘ss’), or
> whether each SELECT statement finds just one record.

You can have these queries return both rows if you use an
accent-ignoring collation, like this example in the documentation:

CREATE COLLATION ignore_accents (provider = icu, locale =
'und-u-ks-level1-kc-true', deterministic = false);
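
As a rough sketch of how that collation could be applied to the quoted
example: the table definition below is an assumed PostgreSQL rendering of
the T-SQL-style one quoted above (GENERATED ALWAYS AS IDENTITY in place of
IDENTITY(1, 1)), and it assumes the ignore_accents collation has already
been created on a server built with ICU support.

```sql
-- Assumed PostgreSQL equivalent of the quoted table definition.
CREATE TABLE MyTable3
(
    ID INT GENERATED ALWAYS AS IDENTITY,
    Comments VARCHAR(100)
);

INSERT INTO MyTable3 (Comments) VALUES ('strasse');
INSERT INTO MyTable3 (Comments) VALUES ('straße');

-- Compare using the nondeterministic collation instead of the column default.
SELECT * FROM MyTable3 WHERE Comments COLLATE ignore_accents = 'strasse';
SELECT * FROM MyTable3 WHERE Comments COLLATE ignore_accents = 'straße';
```

With the collation applied in the WHERE clause like this, each SELECT
should match both rows, as described above. Using the column's default
deterministic collation instead would match only the byte-identical row.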
