Re: BUG #15651: Collation setting en_US.utf8 breaking sort order

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: kaleb(dot)akalework(at)asg(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15651: Collation setting en_US.utf8 breaking sort order
Date: 2019-02-22 18:03:26
Message-ID: 17689.1550858606@sss.pgh.pa.us
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> I have PostgreSQL database on Windows. I created database with Collation of
> en_US.utf8.

Really? AFAIK, Windows doesn't support collation names that look like
that.

> Then I created table (The steps to reproduce are below). I
> inserted a few rows into this table one of which was row with special
> characters "~!@#$^&(". The insert worked fine but then when I do a select on
> the column for values >=' ' (Space), I get back all the rows except for the
> row that contains
> "~!@#$^&(" .

This appears to be the intended behavior of en_US sorting.
On a Linux machine I can reproduce it outside Postgres:

$ LANG=C sort stuff.txt

AAA
BAA
CAA
DAA
~!@#$^&(
$ LANG=en_US sort stuff.txt
~!@#$^&(

AAA
BAA
CAA
DAA

(The first line in my test file contains one space.)
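
For anyone who wants to repeat the comparison, here is a rough sketch of a script that rebuilds the test file and sorts it both ways. It assumes the punctuation row is the literal string ~!@#$^&( and that the file is named stuff.txt as above. It sets LC_ALL rather than LANG so that an LC_COLLATE or LC_ALL value already present in the environment cannot override the choice, and it only attempts the en_US.UTF-8 sort if "locale -a" reports that locale as installed. The variable name c_order is just for illustration.

```shell
#!/bin/sh
# Recreate the test file: one line holding a single space, the four
# alphabetic rows, and the punctuation-only row from the report.
printf '%s\n' ' ' 'AAA' 'BAA' 'CAA' 'DAA' '~!@#$^&(' > stuff.txt

# Byte-wise comparison: space (0x20) sorts first, '~' (0x7E) sorts last.
c_order=$(LC_ALL=C sort stuff.txt)
printf 'C locale:\n%s\n' "$c_order"

# Locale-aware comparison, attempted only if the locale is installed here.
if locale -a 2>/dev/null | grep -qiE '^en_US\.utf-?8$'; then
    printf 'en_US.UTF-8 locale:\n'
    LC_ALL=en_US.UTF-8 sort stuff.txt
fi
```

If the two listings differ in where the punctuation row lands, the difference comes from the operating system's locale definition rather than from Postgres.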

regards, tom lane
