Re: Proper Unicode support

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Alexey Mahotkin <alexm(at)hsys(dot)msk(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proper Unicode support
Date: 2003-08-12 22:18:00
Message-ID: 1060726680.2318.40.camel@fuji.krosing.net
Lists: pgsql-hackers

Oleg Bartunov wrote on Mon, 11.08.2003 at 11:52:
> On Mon, 11 Aug 2003, Peter Eisentraut wrote:
>
> > Alexey Mahotkin writes:
> >
> > > AFAIK, currently the codepoints are sorted in their numerical order. I've
> > > searched the source code and could not find the actual place where this is
> > > done. I've seen executor/nodeSort.c and utils/tuplesort.c. AFAIU, they
> > > are generic sorting routines.
> >
> > PostgreSQL uses the operating system's locale routines for this. So the
> > sort order depends on choosing a locale that can deal with Unicode.
> >
>
> sort order works, but upper/lower are broken.

I think the original MB/Unicode support was written for the Japanese
language and its characters, and AFAIK Japanese does not even have the
concept (or the problem) of upper/lower case.
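
To make the failure mode concrete, here is a minimal sketch (not
PostgreSQL code) of what goes wrong when case conversion is done one byte
at a time on UTF-8 data. The function names bytewise_upper and
charwise_upper are made up for this illustration, and the assumption is
that the byte-oriented routine recognises only ASCII a-z, as it would in
the C locale:

```python
def bytewise_upper(text: str, encoding: str = "utf-8") -> str:
    """Uppercase one byte at a time, changing only ASCII a-z.

    Hypothetical stand-in for a single-byte case routine applied to
    multibyte data; it is not taken from the PostgreSQL sources.
    """
    raw = bytearray(text.encode(encoding))
    for i, b in enumerate(raw):
        if 0x61 <= b <= 0x7A:      # ASCII 'a'..'z'
            raw[i] = b - 0x20      # map to 'A'..'Z'
    # Only ASCII bytes were modified, so the result is still valid UTF-8.
    return raw.decode(encoding)


def charwise_upper(text: str) -> str:
    """Uppercase per character, using Unicode case mappings."""
    return text.upper()


if __name__ == "__main__":
    sample = "õun привет"
    print(bytewise_upper(sample))   # non-ASCII letters stay lower case
    print(charwise_upper(sample))   # every cased letter is converted
```

The byte-wise version leaves every non-ASCII letter untouched, because
the bytes of a multibyte sequence never fall in the a-z range. The
character-wise version has to decode each character first, which is where
the extra cost comes from.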

A question for core: are there any plans to rectify this for less
fortunate languages/charsets?

Will the ASCII-speaking core tolerate the potential loss of performance
from locale-aware upper/lower?

Will this be considered a feature or a bugfix (i.e. should we attempt to
fix it for 7.4)?

---------------
Hannu
