Re: Sorting Problem in UNICODE/german

From: Andreas Seltenreich <andreas+pg(at)gate450(dot)dyndns(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Sorting Problem in UNICODE/german
Date: 2005-09-02 07:54:39
Message-ID: 87r7c729c0.fsf@gate450.dyndns.org
Lists: pgsql-bugs

Klaus Ita wrote:

> On Thu, Sep 01, 2005 at 09:30:15AM -0400, Tom Lane wrote:
>> Klaus Ita <postgres(at)stro(dot)at> writes:
>> > I have tried starting postgres with LC_ALL=de_AT.utf8@euro
>> > locale but that did not help.
>>
>
> i did read the docs and am still not quite happy with my sorting results.
> ok initdb has been rerun
>
> made sure, i had the locale:
>
> locale -a
>
> created new db-cluster with
> LC_ALL=de_AT.utf8@euro initdb --locale=de_AT.utf8@euro -E UNICODE -D /dev/shm/pgutf8
>
> and then still the sorting was not right when i restored another
> UNICODE db.

Well, I used the very same command with 8.0.3 to create a database,
and the sort order was correct:

--8<---------------cut here---------------start------------->8---
scratch=# select w from w order by w;
      w
-------------
 Abend
 Oma
 Österreich
 Überflieger
 Unter
 Zetrix
(6 rows)
--8<---------------cut here---------------end--------------->8---
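
PostgreSQL 8.0 takes its collation from the C library locale chosen at
initdb time, so the same ordering can be cross-checked outside the
database with sort(1). A minimal sketch, assuming a glibc system where
`locale -a` lists the locale as de_AT.utf8; that name and the `words`
helper are illustrative assumptions, not part of the original report:

```shell
#!/bin/sh
# Print the six test words, one per line, with the umlauts written as
# explicit UTF-8 bytes (Ü = 0xC3 0x9C, Ö = 0xC3 0x96) so the script does
# not depend on the editor's or terminal's encoding.
words() {
    printf 'Zetrix\nUnter\n\303\234berflieger\n\303\226sterreich\nOma\nAbend\n'
}

# Byte order: both umlauts start with 0xC3 and therefore sort after
# every ASCII letter, i.e. after "Zetrix".
words | LC_ALL=C sort

# Locale-aware order, only if the (assumed) locale is installed.
if locale -a 2>/dev/null | grep -qi '^de_AT\.utf-*8$'; then
    words | LC_ALL=de_AT.utf8 sort
else
    echo "de_AT.utf8 not installed; skipping locale-aware sort" >&2
fi
```

If the second listing matches the psql output above while the first one
puts the umlauts last, the locale data itself is fine.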

So I guess your client_encoding was misconfigured during the import,
or maybe the dump of your Unicode db was unexpectedly converted by
improper settings when it was taken.
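
One way to narrow this down is to look at the dump file itself. The
sketch below assumes a plain-text dump; `check_dump` is a hypothetical
helper, and it assumes the dump carries a SET client_encoding line,
which may not hold for every pg_dump version:

```shell
#!/bin/sh
# check_dump FILE  (hypothetical helper, not part of PostgreSQL)
# Reports whether FILE is valid UTF-8 and which client_encoding it declares.
check_dump() {
    dump=$1
    # iconv exits non-zero if the input contains bytes that are not valid UTF-8.
    if iconv -f UTF-8 -t UTF-8 "$dump" >/dev/null 2>&1; then
        echo "valid UTF-8"
    else
        echo "not valid UTF-8"
    fi
    # First line mentioning client_encoding, if the dump has one.
    line=$(awk 'tolower($0) ~ /client_encoding/ { print; exit }' "$dump")
    echo "${line:-no client_encoding line found}"
}

# Example: a two-line stand-in for a dump, with Ö as UTF-8 bytes 0xC3 0x96.
tmp=$(mktemp)
printf "SET client_encoding = 'UNICODE';\nINSERT INTO w VALUES ('\303\226sterreich');\n" > "$tmp"
check_dump "$tmp"
rm -f "$tmp"
```

A file that is not valid UTF-8 although the source database was UNICODE
would point at a conversion while dumping; a valid file whose data ends
up garbled after the restore would point at the client_encoding in
effect during the import.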

> another "funny" thing is:
>
> ita@aipc54:~/.mutt$ LC_ALL=de_AT.utf8@euro sort /tmp/testfile
> Abend
> Oma
> Ãterreich
> Ãerflieger
> Unter
> Zetrix
>
> this is also wrong (There should be 'Unter' and then 'U:berflieger'
> [Überflieger]). so is this a libc bug?

The sort order is correct, so libc did its part: in German collation
Ü is ordered like U, so 'Überflieger' (U, b) comes before 'Unter'
(U, n). Maybe your terminal is having issues with UTF-8? If you're
using xterm: did you run it with -u8 or some UTF-8-enabling X
resource? To verify that the terminal is working properly, typing

echo ö > /tmp/foo
file /tmp/foo

in a shell should tell you that you have a UTF-8 text file.
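
If file(1) is not at hand, the bytes can be inspected directly. The
sketch below is only illustrative (`hex_bytes` and `misread_as_latin1`
are made-up helper names): it shows the two bytes a UTF-8 "ö" consists
of, and what those same bytes turn into when something treats them as
Latin-1, which is the kind of garbling visible in the sort output quoted
above.

```shell
#!/bin/sh
# Print the bytes of a string as lowercase hex, without od's padding.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Reinterpret the bytes of a string as ISO-8859-1 and re-encode them as
# UTF-8, which is what a Latin-1 terminal effectively displays.
misread_as_latin1() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# "ö" written as explicit UTF-8 bytes (0xC3 0xB6) so that this script
# does not depend on the editor's or terminal's encoding.
o_umlaut=$(printf '\303\266')

hex_bytes "$o_umlaut"; echo          # c3b6 (a lone f6 would be Latin-1)
misread_as_latin1 "$o_umlaut"; echo  # displays as "Ã¶" on a UTF-8 terminal
```

On a correctly configured terminal, typing `echo ö | od -An -tx1`
by hand should likewise print c3 b6 0a.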

HTH
Andreas