Re: ICU, locale and collation question

From: Kirk Wolak <wolakk(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Oscar Carlberg <oscar(dot)carlberg(at)fortnox(dot)se>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: ICU, locale and collation question
Date: 2023-05-10 05:02:46
Message-ID: CACLU5mSqWDVD1JCHy75F9r8ThFgpFgKcY=isDpiKd5L0u1KFvQ@mail.gmail.com
Lists: pgsql-general

On Tue, May 9, 2023 at 11:24 AM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

> On 09.05.23 08:54, Oscar Carlberg wrote:
> > Our initdb setup would then look like this for compatibility;
> > -E 'UTF-8'
> > --locale-provider=icu
> > --icu-locale=sv-SE-x-icu
> > --lc-monetary=sv_SE.UTF-8
> > --lc-numeric=sv_SE.UTF-8
> > --lc-time=sv_SE.UTF-8
> > --lc-messages=en_US.UTF-8
> >
> > Should we still provide createdb with --lc-collate=C and --lc-ctype=C,
> > or should we set those to sv_SE.UTF-8 as well?
>
> You should set those to something other than C. It doesn't matter much
> what exactly, so what you have there is fine.
>
> Setting it to C would for example affect the ability of the text search
> functionality to detect words containing non-ASCII characters.
>
Doesn't searching with LIKE 'abc%' behave much better under C than under
other collations? That was the driving force behind our choosing C.
[EXPLAIN made it clear that the scan was range-bound, up to 'abd'.]
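
To illustrate the range-bound idea, here is a rough sketch in Python. It is not PostgreSQL's planner code. It assumes plain code-point ordering, which is how Python compares strings and is analogous to the C collation; prefix_range is a hypothetical helper made up for this example. Under that ordering, a prefix match is the same as a half-open range whose upper bound is the prefix with its last character bumped by one.

```python
def prefix_range(prefix: str) -> tuple[str, str]:
    """Return (lower, upper) such that, under plain code-point ordering,
    s.startswith(prefix) is equivalent to lower <= s < upper.

    Hypothetical helper for illustration only; edge cases such as a
    prefix ending in the highest code point are not handled.
    """
    if not prefix:
        raise ValueError("an empty prefix has no finite upper bound")
    # Bump the last character by one code point: 'abc' -> 'abd'.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


lower, upper = prefix_range("abc")

words = ["abb", "abc", "abcd", "abcz", "abd", "Abc", "abcå"]

# Range comparison, the way an ordered index can be scanned.
in_range = [w for w in words if lower <= w < upper]

# Direct prefix test, the meaning of LIKE 'abc%'.
by_prefix = [w for w in words if w.startswith("abc")]

print(upper)
print(in_range == by_prefix)
```

With a linguistic collation the sort order no longer follows code points, so this simple equivalence between a prefix and a range cannot be assumed.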
