Re: Multi-byte character bug

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: richso(at)i-cable(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Multi-byte character bug
Date: 2002-08-01 03:41:43
Message-ID: 20020801.124143.55850484.t-ishii@sra.co.jp
Lists: pgsql-bugs

> > There's no character code in EUC_TW (CNS 11643-1992)
> > corresponding to Big5 0xc05c. That's why PostgreSQL complains.
>
>
> But I've created another db using MULE_INTERNAL encoding, the same error
> reported, why ?

Since the Big5 representation of MULE_INTERNAL is actually "leading
character"+EUC_TW. i.e.

> Why don't Postgres directly support BIG5 in server side

It's for purely technical reasons. Encodings that contain bytes
< 0x80 in the second (or third) byte of a multibyte character confuse
our SQL parser. I think it's not impossible for the parser to handle
Big5, but if we made such a change, the parser would not be able to
handle other encodings. If you have a good idea to overcome these
problems, we would welcome it.
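
To illustrate the problem, here is a minimal Python sketch. It is not
PostgreSQL code: the helper names are made up for this example, and it
assumes the usual Big5 lead-byte range of 0x81-0xFE. It takes the Big5
code 0xc05c discussed in this thread and shows that its second byte is
0x5c, the same value as an ASCII backslash, so a scanner that looks at
the input one byte at a time would see an escape character inside the
string literal.

```python
# Illustrative sketch only; the function names are hypothetical and
# nothing here is taken from the PostgreSQL source.

def ascii_trail_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte) for every Big5 trail byte below 0x80."""
    hits = []
    i = 0
    while i < len(data):
        # Assumed Big5 lead-byte range: 0x81-0xFE.
        if 0x81 <= data[i] <= 0xFE and i + 1 < len(data):
            trail = data[i + 1]
            if trail < 0x80:
                hits.append((i + 1, trail))
            i += 2  # skip the whole two-byte character
        else:
            i += 1
    return hits


def naive_has_backslash(data: bytes) -> bool:
    """Byte-wise check, as an encoding-unaware scanner would do it."""
    return b"\\" in data


big5_char = bytes([0xC0, 0x5C])      # the Big5 code from this thread
literal = b"'" + big5_char + b"'"    # that character inside a quoted literal

print(ascii_trail_bytes(literal))
print(naive_has_backslash(literal))
```

In EUC_TW every byte of a multibyte character is 0x80 or above, so a
byte-wise scan for quotes or backslashes cannot be fooled this way.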

> as BIG5 is the
> main encoding using for Traditional Chinese communities, i.e. HK &
> Taiwan ? As EUC_TW do not have complete correspondings char in BIG5,
> this will seriously prevent the Traditional Chinese communities for
> using Postgresql !

Just curious: why do people living in those areas prefer Big5 over
EUC_TW? I thought EUC_TW (or CNS 11643-1992) was defined by the
government in Taiwan. Is there any technical superiority in Big5?
Or maybe "don't know why, but many people just use Big5" :-)
--
Tatsuo Ishii
