Re: Patch for collation using ICU

From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, John Hansen <john(at)geeknet(dot)com(dot)au>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Patch for collation using ICU
Date: 2005-03-27 00:50:03
Message-ID: EFD91E88714036D5A5EBA996@palle.girgensohn.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

--On lördag, mars 26, 2005 08.16.01 -0800 Stephan Szabo
<sszabo(at)megazone(dot)bigpanda(dot)com> wrote:

> On Sat, 26 Mar 2005, Palle Girgensohn wrote:
>> I've noticed a couple of things about using the ICU patch vs. pristine
>> pg-8.0.1:
>>
>> - ORDER BY is case insensitive when using ICU. This might break the SQL
>> standard (?), but sure is nice :)
>
> Err, I think if your system implements strcoll correctly 8.0.1 can do this
> if the chosen collation is set up that way (or at least naive tests I've
> done seem to imply that). Or are you speaking about C locale?

No, I doubt this.

Example: set up a cluster:
$ initdb -E LATIN1 --locale=sv_SE.ISO8859-1
$ createdb foo
CREATE DATABASE
$ psql foo
foo=# create table bar (val text);
CREATE TABLE
foo=# insert into bar values ('aaa');
INSERT 18354409 1
foo=# insert into bar values ('BBB');
INSERT 18354412 1
foo=# select val from bar order by val;
val
-----
BBB
aaa
(2 rows)

Order by is not case insensitive. It shouldn't be for any system, AFAIK. As
John Hansen noted, this might be a bad thing. I'm not sure about that,
though...

As for general collation of unicode, the reason for me to use ICU is that
my system does not support strcoll correctly for multibyte locales, as I
mentioned earlier. I also noted that even for systems that do handle
strcoll correctly for unicode, ICU claims to be a couple of magnitudes
faster, so this patch might be useful for other systems (read Linux) as
well. See previous emails for details.

Regards,
Palle

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2005-03-27 01:34:03 Re: Patch for collation using ICU
Previous Message Josh Berkus 2005-03-26 23:56:43 Re: Bug 1500