Re: Multibyte (Japanese Character) Sorting

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: mgonzales(at)tspi(dot)com(dot)ph
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Multibyte (Japanese Character) Sorting
Date: 2008-04-30 12:26:06
Message-ID: 20080430.212606.21915336.t-ishii@sraoss.co.jp
Lists: pgsql-general

I have taken a look at the screenshot. Yes, the sort order looks
quite wrong. I tested similar data on my Linux box and the results
were as expected. Do you have an index on the field? Which platform is
PostgreSQL running on? Do you see the same problem using psql? Could
you send me the pg_dump output, if possible?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Thank you for your reply. But I believe our LOCALE was already set to C
> (since this is the default setting).
>
> I've attached the result of my query using "ORDER BY <field> ASC". This
> field contains double-byte characters for both English and Japanese text.
> I think the problem with this sorting is that it sorts by length and
> then by ASCII code value.
>
> Tatsuo Ishii wrote:
> >> Hi there,
> >>
> >> I'm having a problem sorting multibyte characters.
> >>
> >> I am using EUC-JP for my database encoding because we need to support
> >> Japanese (hiragana, katakana, kanji) text, since our clients are Japanese.
> >>
> >> I have a table named "user_info" with the following fields:
> >>
> >> first_name character(60) NOT NULL
> >> last_name character(60) NOT NULL
> >>
> >> We've forced all our entries to double-byte characters, so all data
> >> stored in the table is double-byte. The problem is the sorting: when
> >> you use ORDER BY last_name ASC, the list is not sorted properly. Please
> >> help me fix this problem. Thank you in advance.
> >
> > I'm not sure why you think "not sorted properly", but my wild guess is
> > your OS's locale data is broken. Use C locale.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >
> >
>
> --
> ==================================================
> Morgan Gonzales - 1st BU (MSI) - Tsukiden Software
>
> There are two kinds of people in this world.
> One says to God, thy will be done,
> and the other to whom God says, thy will be done.
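
For reference, here is a minimal sketch of the ordering one would expect from a bytewise comparison, which is how a C locale sort is generally understood to behave. It is only an illustration in Python, not PostgreSQL's own code. The sample names and the helper euc_jp_key are hypothetical and do not come from the poster's data.

```python
# Hypothetical sample values: kanji, katakana, hiragana, and full-width Latin.
names = ["山田", "アベ", "あべ", "Ａｂｅ"]


def euc_jp_key(text: str) -> bytes:
    """Return the EUC-JP bytes of text.

    Comparing these byte strings approximates a bytewise (C locale style)
    comparison of values stored in an EUC-JP database.
    """
    return text.encode("euc_jp")


# Sort by encoded bytes rather than by Unicode code point.
sorted_names = sorted(names, key=euc_jp_key)

# Print each name next to its encoded bytes so the ordering can be inspected.
for name in sorted_names:
    print(name, euc_jp_key(name).hex())
```

Under this assumption the values group by their leading EUC-JP byte: full-width Latin first, then hiragana, then katakana, then kanji. A result ordered by length would not match this pattern, which is one way to check whether the server is really comparing bytes.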
