Re: ORDER BY and Unicode

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: "M(dot) Bastin" <marcbastin(at)mindspring(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: ORDER BY and Unicode
Date: 2004-05-12 13:51:08
Message-ID: 20040512064519.A73325@megazone.bigpanda.com
Lists: pgsql-novice

On Wed, 12 May 2004, M. Bastin wrote:

> There seems to be a big problem with Unicode for
> which a solution might already exist. Somebody
> had the following problem on another mailing
> list. My suggestion is at the bottom of this
> message but if another solution already exists
> I'd like to hear about it.
>
> The problem is that special characters aren't
> treated right under Unicode. Here are a few
> examples:
>
> 1. "UPPER('')" doesn't work.

IIRC, right now upper and lower only work correctly in
single-byte encodings. I think this problem will go away
once full SQL collation and character set behavior is
implemented.
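
To illustrate why byte-oriented case mapping breaks down with a
multibyte encoding, here is a rough sketch in Python. It is only an
analogy and not PostgreSQL's actual code: the sample string is made up,
and Python's bytes.upper() (which maps ASCII letters only) stands in
for a case mapping that looks at one byte at a time.

```python
# Illustration only: per-byte case mapping vs. per-character case mapping.
# The sample text is hypothetical; this is not PostgreSQL's implementation.
text = "café"

# bytes.upper() only maps ASCII a-z, so the two UTF-8 bytes that encode
# "é" (0xC3 0xA9) pass through unchanged.
bytewise = text.encode("utf-8").upper().decode("utf-8")

# str.upper() works on whole characters and knows the Unicode case mapping.
charwise = text.upper()

print(bytewise)  # CAFé
print(charwise)  # CAFÉ
```

The accented character survives the byte-wise pass in lowercase because
no single byte of its encoding is a letter on its own.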

> 2. "ORDER BY mycolumn" gives a wrong sort order.
>
> Uppercase ASCII characters come first, then
> lowercase ASCII, then accented characters...
> This really isn't what a human would like to see.

This is driven by locale. What LC_COLLATE value was the
database created with? (If you don't know, pg_controldata
should give that to you.)

It sounds like the locale is the "C" locale, which sorts
by byte value, or perhaps the locale is one that doesn't
match the database's encoding.
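
As a rough sketch of what byte-value ordering does to UTF-8 text, the
Python snippet below reproduces the order described in the question:
uppercase ASCII first, then lowercase ASCII, then accented characters.
The word list and the rough_key helper are made up for illustration,
and the second sort is only a crude approximation of what a
language-aware locale would do, not a real collation.

```python
import unicodedata

# Hypothetical sample data, chosen to mix case and accents.
words = ["zebra", "Apple", "éclair", "apple", "Zebra", "Éclair"]

# Comparing the UTF-8 bytes mimics a byte-value ("C" locale style) sort.
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))


def rough_key(word):
    """Crude stand-in for a locale-aware key: drop accents, ignore case.

    The original word is kept as a tie-breaker so the order is stable.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), word)


rough_order = sorted(words, key=rough_key)

print(byte_order)   # ['Apple', 'Zebra', 'apple', 'zebra', 'Éclair', 'éclair']
print(rough_order)  # ['Apple', 'apple', 'Éclair', 'éclair', 'Zebra', 'zebra']
```

The first result is what a byte-value sort produces; the second is
closer to what a human reader would expect from a sorted list.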
