Re: Collate order on Mac OS X, text with diacritics in UTF-8

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Martin Flahault <martin(at)billjobs(dot)com>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, pgsql-general(at)postgresql(dot)org
Subject: Re: Collate order on Mac OS X, text with diacritics in UTF-8
Date: 2010-01-14 05:32:04
Message-ID: 4B4EAC54.2070300@postnewspapers.com.au
Lists: pgsql-general

Martijn van Oosterhout wrote:

>> in a UTF8 text file and use the "sort" command on it, you will have the same wrong output as with PostgreSQL :
>
> Yes, that's the basic idea. Mac OS X apparently provides ICU underneath
> for programs that would like true Unicode collation, but there is
> little chance that PostgreSQL will ever use this.

Out of interest: Why not?

Using ICU would make Pg independent of libc's collation rules,
finally permitting things like choosing a particular collation for a
textual sort. It'd make mixing data from different locales in a database
a lot easier (read: possible to do correctly).
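
To make the difference concrete, here is a rough Python sketch of the two
behaviours. It does not use ICU at all: plain sorted() stands in for the
code-point ordering being complained about in this thread, and a hypothetical
strip_marks_key() helper stands in, very crudely, for what a real collator
would do. The word list is made up for illustration.

```python
import unicodedata

# Sample words; \u00e9 is a precomposed "e with acute accent".
words = ["zebra", "\u00e9cole", "eclair", "Eden", "\u00e9tude"]


def strip_marks_key(word):
    """Crude sort key: compare base letters first, ignoring accents and case.

    This is only an approximation for illustration. A real collator
    (ICU, for example) applies per-locale tailoring that this does not.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word is a tie-breaker so the ordering stays deterministic.
    return (base.casefold(), word)


# Code-point order: uppercase first, accented letters after "z".
codepoint_order = sorted(words)

# Approximate dictionary order: accented letters sort next to their base letter.
approx_order = sorted(words, key=strip_marks_key)

print(codepoint_order)
print(approx_order)
```

In the first list the accented words land after "zebra"; in the second they
sit next to "eclair" and "Eden", which is closer to what a reader would
expect. The key function is a per-sort choice, independent of the process
locale, which is the property I'm after here.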

Is this just a matter of "nobody cares enough to produce a solid, tested
patch with equivalent performance that doesn't turn people who try to
review it green with disgust" ... or are there specific reasons why
using something like ICU instead of libc's locale support is not
appropriate for Pg?

--
Craig Ringer
