RE: BUG #15651: Collation setting en_US.utf8 breaking sort order

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Kaleb Akalework" <kaleb(dot)akalework(at)asg(dot)com>
Cc: "Peter Geoghegan" <pg(at)bowt(dot)ie>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>,"PostgreSQL mailing lists" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: BUG #15651: Collation setting en_US.utf8 breaking sort order
Date: 2019-02-23 17:48:55
Message-ID: 4049e84f-d114-4f48-bec2-432c715e2ff9@manitou-mail.org
Lists: pgsql-bugs

Kaleb Akalework wrote:

> Ok so if this is intended behavior of UTF8 then I understand. My last
> question then would be if I use a collation setting of C, does it mean I
> won't be able to support multiple languages?

You seem to want the sort order of C, but be aware that you might
have to decide whether you want this:

=> select upper('é' collate "C");
upper
-------
é
(1 row)

or that:

=> select upper('é' collate "en_US");
upper
-------
É
(1 row)

To get the sort order of C but the interpretation of characters closer
to what you'd expect from Unicode, it's possible for the database
to have LC_COLLATE set to "C" and LC_CTYPE set to, say, en_US.UTF-8.
See CREATE DATABASE.
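
As a rough sketch of that setup (the database name "mydb" is only a
placeholder, and it assumes the en_US.UTF-8 locale is installed on the
server's operating system), the statement could look like this:

```sql
-- "mydb" is a hypothetical name used for illustration.
-- template0 is used because the locale settings differ from those
-- of the default template database.
CREATE DATABASE mydb
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'en_US.UTF-8';
```

In a database created this way, ORDER BY on text columns should compare
by byte value as with "C", while upper() and lower() should follow the
en_US.UTF-8 character classification.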

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
