Re: sorting chinese characters

From: Ian Barwick <barwick(at)gmx(dot)net>
To: "prabahar" <prabahar(at)questech(dot)co(dot)in>, pgsql-sql(at)postgresql(dot)org
Subject: Re: sorting chinese characters
Date: 2003-04-25 18:10:11
Message-ID: 200304252010.11373.barwick@gmx.net
Lists: pgsql-sql

On Friday 25 April 2003 10:22, prabahar wrote:
> Hi, I have a requirement where I have to sort a field which has euc-jp
> characters in it. When I sort them we find that Japanese Hiragana
> Characters are sorted properly. But Chinese characters are not sorted
> properly.

Can you define "properly"? What is it you want to sort?

> Can anyone give some suggestions on how to fix it? I have set
> LC_ALL=ja_JP in the profile.

Unfortunately, with Japanese "Chinese" characters there is no algorithmically
determinable sort order. You will need some kind of lookup table containing
hiragana (and possibly katakana) if you want to sort in phonetic dictionary
order, as there is a "many to many" relationship between characters /
combinations of characters and their pronunciation(s).

If the data you are dealing with represents names, you don't have a chance
unless the data comes with the pronunciation in a separate field (which
is why Japanese forms usually have space for both characters and
pronunciation).
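
To illustrate the separate-field approach, here is a minimal Python sketch,
not a definitive implementation. The records and the function name
sort_by_reading are hypothetical, and it assumes every name comes with a
reading written in hiragana. Comparing hiragana by code point only
approximates dictionary order (voiced and small kana need extra care), so
treat it as a starting point.

```python
# Hypothetical records: (name as written, reading in hiragana).
# The reading must be supplied with the data; it cannot be derived
# reliably from the characters themselves.
people = [
    ("山田", "やまだ"),
    ("鈴木", "すずき"),
    ("田中", "たなか"),
    ("阿部", "あべ"),
]


def sort_by_reading(records):
    """Sort (written, reading) pairs by the reading field only."""
    # Assumption: plain code-point comparison of hiragana is close
    # enough to phonetic (gojuon) order for this illustration.
    return sorted(records, key=lambda rec: rec[1])


names = [written for written, _ in sort_by_reading(people)]
```

The same idea carries over to the database: keep the reading in its own
column and order by that column rather than by the column holding the
characters.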

It should be possible using a lookup table to determine sort order of a given
set of characters based on their structure (radical / stroke count), but this
method of sorting is archaic and generally not used.
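
For completeness, a structural sort could be sketched along these lines in
Python. The table RADICAL_STROKE and the function radical_stroke_key are
hypothetical names; the table is a tiny hand-made sample mapping each
character to (radical number, additional stroke count), and a real table
would have to come from a character dictionary or similar source.

```python
# Tiny illustrative lookup table (an assumption, not a complete resource):
# character -> (Kangxi radical number, strokes in addition to the radical).
RADICAL_STROKE = {
    "山": (46, 0),
    "川": (47, 0),
    "木": (75, 0),
    "林": (75, 4),
    "森": (75, 8),
    "田": (102, 0),
}


def radical_stroke_key(text):
    """Build a sort key from the (radical, strokes) pair of each character."""
    # A character missing from the table raises KeyError; a real
    # implementation would need a policy for unknown characters.
    return [RADICAL_STROKE[ch] for ch in text]


words = ["田", "森", "山", "林", "川", "木"]
by_structure = sorted(words, key=radical_stroke_key)
```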

Ian Barwick
barwick(at)gmx(dot)net
