Re: Distinct oddity

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Maximilian Tyrtania <maximilian(dot)tyrtania(at)onlinehome(dot)de>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Distinct oddity
Date: 2009-05-13 15:47:56
Message-ID: 368.1242229676@sss.pgh.pa.us
Lists: pgsql-sql

I wrote:
> Maximilian Tyrtania <maximilian(dot)tyrtania(at)onlinehome(dot)de> writes:
>> On 12.05.2009 at 19:23, Alvaro Herrera
>> (alvherre(at)commandprompt(dot)com) wrote:
>>> What platform are you using anyway?

>> Mac OS 10.4.11

> I have some vague recollection that UTF8-using locales don't actually
> work well on OSX ... check the archives ...

OK, the thread (or one of the threads) I was remembering is here:
http://archives.postgresql.org//pgsql-general/2005-11/msg00047.php

I am too lazy to boot up 10.4 right now, but looking on a 10.5.6 machine
indicates that Apple is still being pretty lame about this:

$ ls -l /usr/share/locale/de_DE
total 40
lrwxr-xr-x 1 root wheel 28 Feb 27 2008 LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x 1 root wheel 17 Feb 27 2008 LC_CTYPE -> ../UTF-8/LC_CTYPE
drwxr-xr-x 3 root wheel 102 Feb 27 2008 LC_MESSAGES
lrwxr-xr-x 1 root wheel 30 Feb 27 2008 LC_MONETARY -> ../de_DE.ISO8859-1/LC_MONETARY
lrwxr-xr-x 1 root wheel 29 Feb 27 2008 LC_NUMERIC -> ../de_DE.ISO8859-1/LC_NUMERIC
-r--r--r-- 1 root wheel 370 Jan 2 2008 LC_TIME

So it looks like they understand UTF-8 to the extent of supporting
character classification fairly well, but sort order is "just ASCII".
I'm not sure exactly how that might result in the observed odd behavior
of DISTINCT, but I bet it's causing it somehow. You'd probably have
better luck in the de_DE.ISO8859-1 or de_DE.ISO8859-15 locales.
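
A rough way to check whether a given locale's sort order is effectively
"just ASCII" on a particular machine is to compare the C library's
strcoll() ordering against plain code-point ordering. The following is
only a sketch: the function name and the sample strings are made up for
illustration, and it says nothing about what DISTINCT itself does with
the result.

```python
import locale
from functools import cmp_to_key


def collation_is_codepoint_order(locale_name, samples):
    """Hypothetical helper: report whether LC_COLLATE for locale_name
    sorts `samples` exactly like plain code-point order.

    Returns True or False, or None if the locale is not installed.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
        except locale.Error:
            return None  # locale not available on this system
        # Sort using the C library's collation rules for that locale.
        by_locale = sorted(samples, key=cmp_to_key(locale.strcoll))
    finally:
        # Always restore the previous collation setting.
        locale.setlocale(locale.LC_COLLATE, saved)
    # Python's default string sort compares code points.
    return by_locale == sorted(samples)


if __name__ == "__main__":
    words = ["Zebra", "apfel", "Äpfel", "Apfel", "zebra"]
    for name in ("de_DE.UTF-8", "de_DE.ISO8859-1", "C"):
        print(name, collation_is_codepoint_order(name, words))
```

If the UTF-8 locale reports True for a list like this while the ISO8859-1
one reports False, that would be consistent with the symlink to
la_LN.US-ASCII shown above. Locale names differ between platforms, so the
names in the loop may need adjusting.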

regards, tom lane
