Re: interesting finding on order by behaviour

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Samuel Hwang *EXTERN*" <samuel(at)replicon(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: interesting finding on order by behaviour
Date: 2011-07-25 09:24:06
Message-ID: D960CB61B694CF459DCFB4B0128514C206B21125@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Samuel Hwang wrote:
> I ran the same tests in SQL Server 2008R2, Oracle10 and PostgreSQL
> 9.0.4 and found something interesting...
>
> set up
> =====
> drop table t1
> create table t1 (f1 varchar(100))
> insert into t1 (f1) values ('AbC')
> insert into t1 (f1) values ('CdE')
> insert into t1 (f1) values ('abc')
> insert into t1 (f1) values ('ABc')
> insert into t1 (f1) values ('cde')
>
> test
> ===
> select * from t1 order by f1
> select min(f1) as min, max(f1) as max from t1
>
> results
> =====
> SQL Server 2008 R2 (with case insensitive data, the ordering follows
> ASCII order)
>
> f1
> ---
> AbC
> abc
> ABc
> cde
> CdE
>
> min max
> ------ -------
> AbC CdE
>
> Oracle 10 (data is case-sensitive, the ordering follows ASCII order)
>
> f1
> ---
> ABc
> AbC
> CdE
> abc
> cde
>
> min max
> ------ -------
> ABc cde
>
> PostgreSQL 9.0.4 (data is case-sensitive, the ordering is ...
> DIFFERENT)
>
> f1
> ---
> abc
> AbC
> ABc
> cde
> CdE
>
> min max
> ------ -------
> abc CdE
>

> The server encoding is set to UTF8 and collation is united states.1252
> The client encoding is Unicode.

I can only speak about Oracle and PostgreSQL.

The problem is that they use different collations.

I don't know what NLS_LANGUAGE is set to in your Oracle session, but I
assume
that it is AMERICAN. You can check with
SELECT value FROM nls_session_parameters WHERE
parameter='NLS_LANGUAGE';

Sorting in Oracle is controled by the NLS_SORT parameter, which by
default
is set to BINARY if NLS_LANGUAGE is AMERICAN, which is why you get ASCII
ordering (in GERMAN, it would be different :^/ ).

PostgreSQL uses the operating system's collation, which in your case
gives
you linguistic ordering.

In Oracle, try something like
ALTER SESSION SET NLS_SORT = 'GENERIC_M';
for a non-binary sorting order, and in PostgreSQL (before 9.1), create
your
database with C collation for binary sorting order.

You can force binary order in PostgreSQL with
SELECT * FROM t1 ORDER BY f USING ~<~;

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Yan Chunlu 2011-07-25 09:39:54 Re: streaming replication does not work across datacenter with 20ms latency?
Previous Message Frank Lanitz 2011-07-25 08:31:04 Re: Implementing "thick"/"fat" databases