Re: sorting Chinese varchar field

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: chenjian1227(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: sorting Chinese varchar field
Date: 2005-03-29 07:38:50
Message-ID: 20050329.163850.48519443.t-ishii@sra.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Great, that works out fine!
>
> So, the SQL I tested with is:
> select * from mytable order by convert(name, 'utf8', 'gb18030');

Sorry, what I wanted to say was:

SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character using utf_8_to_gb_18030);

Of course your example is fine too (actually they are internally
identical).
--
Tatsuo Ishii

> It produces the correct output.
>
> Thanks Tatsuo!
>
> Jian
>
> On Tue, 29 Mar 2005 10:25:58 +0900 (JST), Tatsuo Ishii
> <t-ishii(at)sra(dot)co(dot)jp> wrote:
> > > Hi,
> > >
> > > I installed postgres 8.0 for windows on my win xp (Simplified Chinese
> > > version). The encoding is unicode. When I set pgsql client encoding to
> > > gb18030, I could insert Chinese text from the command line to
> > > postgres.
> > >
> > > However, I could not get the sort order of Chinese varchar field to
> > > work properly.
> > >
> > > What I tried are as follows:
> > >
> > > 1) installed postgres for windows and used the "C" locale.
> > > 2) installed postgres for windows and used the "Chinese, PRC" locale.
> > >
> > > Again, in both cases, the backend encoding is unicode.
> > >
> > > The other interesting thing I observed was, when setting to the "C"
> > > locale, the following sql worked fine:
> > > select * from user where name = 'xxxxx';
> > > xxxxx is a Chinese text
> > > However, if I set the locale to "Chinese, PRC" during installation,
> > > the above select did not get the any matching rows, where it should
> > > have got. In this case, the following worked fine:
> > > select * from user where name ilike 'xxxxx%';
> > >
> > > Could anyone let me know the best practice for using postgres to store
> > > Chinese text? (This should be the same problem I guess, for using
> > > postgres to store other languages than English.)
> >
> > I would suggest to use UNICODE/C locale combo. On most systems the
> > locale database for multibyte encodings are broken as far as I
> > know. For the sorting problem, probably you could get the right sort
> > order by using convert. i.e.
> >
> > SELECT * FROM t1 WHERE ... ORDER BY CONVERT(your_chinese_character, utf_8_to_gb_18030);
> >
> > If above does not work, you cannot get the right sort order even if
> > you use GB18030 anyway.
> > --
> > Tatsuo Ishii
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Robin Ericsson 2005-03-29 07:48:39 Re: LWM 2004 Readers' Choice Nomination
Previous Message Sim Zacks 2005-03-29 06:40:22 after gentoo emerge plpython failure