Re: order by different on mac vs linux

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Wes James <comptekki(at)gmail(dot)com>
Cc: Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: order by different on mac vs linux
Date: 2012-05-17 02:24:06
Message-ID: CAOR=d=2dEv2Cq7DWGoecVrYUr_Fm5r5cMbug55b_rNZfwVKcnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Wed, May 16, 2012 at 7:58 PM, Wes James <comptekki(at)gmail(dot)com> wrote:
>
>
> On Wed, May 16, 2012 at 5:00 PM, Samuel Gendler <sgendler(at)ideasculptor(dot)com>
> wrote:
>>
>>
>>
>> On Wed, May 16, 2012 at 3:46 PM, Wes James <comptekki(at)gmail(dot)com> wrote:
>>>
>>>
>>>
>>> On Mon, May 14, 2012 at 5:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>
>>>> Wes James <comptekki(at)gmail(dot)com> writes:
>>>> > Why is there a different order on the different platforms.
>>>>
>>>> This is not exactly unusual.  You should first check to see if
>>>> lc_collate is set differently in the two installations --- but even if
>>>> it's the same, there are often platform-specific interpretations of
>>>> the sorting rules.  (Not to mention that OS X is flat out broken when
>>>> it comes to sorting UTF8 data ...)
>>>>
>>>
>>> I just ran these:
>>>
>>> linux:
>>>
>>> on linux
>>>
>>> # SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM
>>> pg_user;
>>>  case
>>> -------
>>>  FALSE
>>> (1 row)
>>>
>>> # show lc_collate;
>>>
>>>  lc_collate
>>> -------------
>>>  en_US.UTF-8
>>> (1 row)
>>>
>>> ------------------------
>>>
>>> on mac os x:
>>>
>>> # SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM
>>> pg_user;
>>>  case
>>> ------
>>>  TRUE
>>> (1 row)
>>>
>>> # show lc_collate;
>>>  lc_collate
>>> -------------
>>>  en_US.UTF-8
>>> (1 row)
>>>
>>>
>>> -----------------------
>>>
>>> Why is the linux postgres saying false with the lc_collage set the way it
>>> is?
>>
>>
>> That's the point - UTF-8 collation is just completely broken under OS X.
>>  There's much previous discussion of the topic on this list and elsewhere.
>>  If you're developing on OS X but running linux and you are mostly using an
>> ascii character set in your test dataset, set your development OS X boxes to
>> use C collation, which will basically do what you expect it do do until you
>> start throwing multibyte characters at it.  If you can't constrain your
>> testing/development dataset in such a manner and collation order really
>> matters during development, then you probably shouldn't develop on OS X.  I
>> spent a fair amount of time investigating how to define a new charset in
>> what proved to ultimately be a futile attempt to redefine UTF-8 on OSX to
>> behave just like ti does on Linux.  I just gave it up after wasting a few
>> too many hours on it. It may be possible to do it, but the return on
>> invested time was non-existent for me so I abandoned my effort.
>>
>
> Why are people saying os x is broken in my case?  Looking
> at  http://www.utf8-chartable.de/  and  http://www.asciitable.com/  for sort
> order, ! should come before apache.  On os x it is correct, on ubuntu linux
> it is not.  In the order by output per my previous emails, it is correct on
> os x, but no on linux.  Why do people keep saying os x is broken, when it is
> doing the correct thing from what I've seen?

You're confusing encoding with locale. UTF-8 is an encoding. If
there were no Locale (i.e. it was set to C, simple byte ordering) then
you'd be correct. HOWEVER, a locale is a different animal altogether.
For instance, most locales ignore many characters when it comes to
sort ordering. Such as spaces:

smarlowe=# create table test (i text);
CREATE TABLE
smarlowe=# insert into test values ('abc'),(' abc'),('def'),(' def');
INSERT 0 4
smarlowe=# select * from test order by i;
i
------
abc
abc
def
def
(4 rows)

Note the spaces are ignored for sorting purposes.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Andreas 2012-05-19 19:04:38 Need help in grouping records
Previous Message Wes James 2012-05-17 01:58:10 Re: order by different on mac vs linux