Re: ICU for global collation

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Julien Rouhaud" <rjuju123(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU for global collation
Date: 2022-03-15 19:48:00
Message-ID: 5278e460-32f5-4ae2-9271-7ce028b8b197@manitou-mail.org
Lists: pgsql-hackers

Julien Rouhaud wrote:

> > > While on that topic, the doc should probably mention that default ICU
> > > collations can only be deterministic.
> >
> > Well, there is no option to do otherwise, so I'm not sure where/how to
> > mention that. We usually don't document options that don't exist. ;-)
>
> Sure, but I'm afraid that users may still be tempted to use ICU locales like
> und-u-ks-level2 from the case_insensitive example in the doc and hope that
> it will work accordingly.

+1.

The CREATE DATABASE doc says this currently:

icu_locale

Specifies the ICU locale ID if the ICU locale provider is used.

ISTM that we need to say explicitly that this locale will be used by
default to compare all collatable strings, except that when the locale
considers two strings equal, a bytewise comparison is used to break the tie.
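
To illustrate the tie-breaking rule, here is a rough sketch in Python. It is
not what the backend actually runs: collation_key() is a hypothetical helper,
and str.casefold() is only a crude stand-in for an ICU sort key at a
case-insensitive strength such as und-u-ks-level2. The point is only to show
why such a locale cannot make 'abc' = 'ABC' true when the collation is
deterministic.

```python
def collation_key(s: str) -> str:
    """Hypothetical stand-in for an ICU sort key that ignores case.

    Assumption: case folding approximates a case-insensitive ICU locale.
    Real sort keys come from the ICU library, not from str.casefold().
    """
    return s.casefold()


def nondeterministic_compare(a: str, b: str) -> int:
    """Compare using only the locale's notion of order (no tie-break)."""
    ka, kb = collation_key(a), collation_key(b)
    return (ka > kb) - (ka < kb)


def deterministic_compare(a: str, b: str) -> int:
    """Compare with the locale first, then break ties bytewise."""
    result = nondeterministic_compare(a, b)
    if result != 0:
        return result
    # The locale considers the strings equal: fall back to comparing
    # the raw UTF-8 bytes, so only identical strings compare as equal.
    ba, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ba > bb) - (ba < bb)


if __name__ == "__main__":
    print(nondeterministic_compare("abc", "ABC"))  # locale-only result
    print(deterministic_compare("abc", "ABC"))     # result after tie-break
```

In this sketch the locale-only comparison treats "abc" and "ABC" as equal,
while the deterministic variant still orders them, because their UTF-8 bytes
differ. Only byte-identical strings end up equal.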

The idea is to describe what the backend will do with the setting
rather than saying that we don't have a nondeterministic option.

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
