Re: order by different on mac vs linux

From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To: Wes James <comptekki(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-sql(at)postgresql(dot)org
Subject: Re: order by different on mac vs linux
Date: 2012-05-16 23:00:31
Message-ID: CAEV0TzAAX_oG=ebof9e7W2TK8k1MUQ47XfvMi089=hFvtccszQ@mail.gmail.com
Lists: pgsql-sql

On Wed, May 16, 2012 at 3:46 PM, Wes James <comptekki(at)gmail(dot)com> wrote:

>
>
> On Mon, May 14, 2012 at 5:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Wes James <comptekki(at)gmail(dot)com> writes:
>> > Why is there a different order on the different platforms.
>>
>> This is not exactly unusual. You should first check to see if
>> lc_collate is set differently in the two installations --- but even if
>> it's the same, there are often platform-specific interpretations of
>> the sorting rules. (Not to mention that OS X is flat out broken when
>> it comes to sorting UTF8 data ...)
>>
>>
> I just ran these:
>
> on linux:
>
> # SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM
> pg_user;
> case
> -------
> FALSE
> (1 row)
>
> # show lc_collate;
>
> lc_collate
> -------------
> en_US.UTF-8
> (1 row)
>
> ------------------------
>
> on mac os x:
>
> # SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM
> pg_user;
> case
> ------
> TRUE
> (1 row)
>
> # show lc_collate;
> lc_collate
> -------------
> en_US.UTF-8
> (1 row)
>
>
> -----------------------
>
> Why is the linux postgres saying false with the lc_collate set the way it
> is?
>

That's the point - UTF-8 collation is just completely broken under OS X.
There's much previous discussion of the topic on this list and elsewhere.
If you're developing on OS X but running Linux, and you are mostly using an
ASCII character set in your test dataset, set your development OS X boxes
to use C collation, which will basically do what you expect it to do until
you start throwing multibyte characters at it. If you can't constrain your
testing/development dataset in such a manner and collation order really
matters during development, then you probably shouldn't develop on OS X. I
spent a fair amount of time investigating how to define a new charset in
what ultimately proved to be a futile attempt to redefine UTF-8 on OS X to
behave just like it does on Linux. I gave up after wasting a few too many
hours on it. It may be possible, but the return on invested time was
non-existent for me, so I abandoned the effort.
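
As a rough sketch of what "use C collation" could look like (not something
I've tested on your setup): the database name devdb is just a placeholder,
and the COLLATE clause assumes PostgreSQL 9.1 or later. On older versions
only the CREATE DATABASE part applies.

```sql
-- Hypothetical development database; "devdb" is a placeholder name.
-- template0 is needed because the locale differs from template1's.
CREATE DATABASE devdb
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'C';

-- Per-expression alternative (PostgreSQL 9.1+): force a byte-order
-- comparison regardless of the database's lc_collate setting.
SELECT 'apache' > '!yada' COLLATE "C" AS apache_sorts_after;
```

With C collation the comparison is plain byte order, so it gives the same
answer on every platform. It does not reproduce Linux's en_US.UTF-8
ordering, so use it on both sides if the results have to match.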
