Re: BUG #15651: Collation setting en_US.utf8 breaking sort order

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, kaleb(dot)akalework(at)asg(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15651: Collation setting en_US.utf8 breaking sort order
Date: 2019-02-23 08:36:39
Message-ID: 1f6f60bf-79a4-b739-5e04-8360085cf3d8@2ndquadrant.com
Lists: pgsql-bugs

On 2019-02-22 19:03, Tom Lane wrote:
> $ LANG=en_US sort stuff.txt
> ~!@#$^&(
>
> AAA
> BAA
> CAA
> DAA

With ICU (COLLATE "und-x-icu"), I get the line with the space first. I
took a brief look through the various Unicode documents and did not
find anything that would justify the glibc behavior.

<obscure detail>
However, since some of those special characters are variable collating
elements and some are not, there might well be an explanation.
</obscure detail>

So, maybe try ICU.
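
For anyone who wants to compare orderings outside the database, here is a
minimal sketch. It assumes the input file matches the quoted output, with
the second line being a single space; the file name stuff.txt is taken
from the quoted command, and the file contents are reconstructed rather
than copied from the original report. The C locale run gives plain byte
order as a baseline. The en_US.utf8 run depends on the installed glibc
version and on that locale being available, so no output is claimed for it.

```shell
# Recreate the test file (contents assumed from the quoted output;
# the second entry is a line holding a single space).
printf '%s\n' '~!@#$^&(' ' ' 'AAA' 'BAA' 'CAA' 'DAA' > stuff.txt

# Baseline: byte order. Space (0x20) sorts before 'A' (0x41),
# and '~' (0x7E) sorts after the uppercase letters.
LC_ALL=C sort stuff.txt

# glibc locale ordering, as in the quoted command.
# Requires the en_US.utf8 locale to be installed; results may vary
# between glibc versions.
LC_ALL=en_US.utf8 sort stuff.txt
```

In byte order the space line comes first and the punctuation line comes
last, which gives a fixed reference point when comparing the glibc and
ICU results.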

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
