Re: Patch for collation using ICU

From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, John Hansen <john(at)geeknet(dot)com(dot)au>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Patch for collation using ICU
Date: 2005-03-27 03:12:19
Message-ID: B3322977A55B6FBA7897A9EA@palle.girgensohn.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

--On lördag, mars 26, 2005 17.40.01 -0800 Stephan Szabo
<sszabo(at)megazone(dot)bigpanda(dot)com> wrote:

>
> On Sun, 27 Mar 2005, Palle Girgensohn wrote:
>
>>
>>
>> --On lördag, mars 26, 2005 08.16.01 -0800 Stephan Szabo
>> <sszabo(at)megazone(dot)bigpanda(dot)com> wrote:
>>
>> > On Sat, 26 Mar 2005, Palle Girgensohn wrote:
>> >> I've noticed a couple of things about using the ICU patch vs. pristine
>> >> pg-8.0.1:
>> >>
>> >> - ORDER BY is case insensitive when using ICU. This might break the
>> >> SQL standard (?), but sure is nice :)
>> >
>> > Err, I think if your system implements strcoll correctly 8.0.1 can do
>> > this if the chosen collation is set up that way (or at least naive
>> > tests I've done seem to imply that). Or are you speaking about C
>> > locale?
>>
>> No, I doubt this.
>>
>> Example: set up a cluster:
>> $ initdb -E LATIN1 --locale=sv_SE.ISO8859-1
>> $ createdb foo
>> CREATE DATABASE
>> $ psql foo
>> foo=# create table bar (val text);
>> CREATE TABLE
>> foo=# insert into bar values ('aaa');
>> INSERT 18354409 1
>> foo=# insert into bar values ('BBB');
>> INSERT 18354412 1
>> foo=# select val from bar order by val;
>> val
>> -----
>> BBB
>> aaa
>> (2 rows)
>>
>>
>> Order by is not case insensitive. It shouldn't be for any system, AFAIK.
>> As
>
> It is on my machine... for the same test:
>
> foo=# select val from bar order by val;
> val
> -----
> aaa
> BBB
> (2 rows)
>
> I think this just implies even greater breakage of either the collation or
> strcoll on the system you're trying on. ;) Which, of course, is a fairly
> reasonable reason to offer an alternative. Especially if it's generically
> useful.

Interesting! Indeed, just tried on an old Linux Redhat system... BTW,
that's pretty odd for a unix system. "ls -l" sorts aaa before BBB, I've
never seen the likes of it! Call me old fashion if you like ;-)

Still, as you say, FreeBSD does it capital letters first, and does not
handle unicode locales' collation, so I need an alternative. Perhaps the
best way would be to inject ICU into BSD instead :-)

/Palle

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Qingqing Zhou 2005-03-27 05:26:53 Re: Postgres recoverey for deleted row of data
Previous Message Palle Girgensohn 2005-03-27 02:24:13 Re: Patch for collation using ICU