Re: ERROR: character 0xe3809c of encoding "UTF8" has no equivalent in EUC_JP

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: ktuszynska(at)esri(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: ERROR: character 0xe3809c of encoding "UTF8" has no equivalent in EUC_JP
Date: 2011-03-23 01:58:32
Message-ID: 20110323.105832.497527362714561205.t-ishii@sraoss.co.jp
Lists: pgsql-bugs

> Hi,
> I was wondering if this was considered a bug, and if so what were the plans to fix it: http://archives.postgresql.org/pgsql-bugs/2005-08/msg00211.php
>
> I searched the pgsql-bugs archive and found nothing.
> I also searched the wiki to-do list and found nothing.
> But I could have missed it.

I don't consider it a bug.

We map "WAVE DASH" of EUC-JP (0xa1c1) to U+FF5E, not U+301C. U+FF5E
and U+301C look the same, but they are different code points for some
reason I don't know. EUC-JP, on the other hand, has only one code point
for WAVE DASH. So if we want round-trip conversion between
EUC-JP and UTF-8 to work, we have to choose either U+FF5E or U+301C. We
have chosen U+FF5E. If we changed the mapping, many existing applications
would break.
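
As a rough illustration of the paragraph above, the following Python
sketch models the choice with a toy one-entry table. It is not
PostgreSQL's actual conversion code: the table names and the functions
utf8_to_euc_jp and euc_jp_to_utf8 are hypothetical, and only the code
points quoted in this message are used.

```python
# Toy model of a one-to-one mapping table. These names are hypothetical
# and do not correspond to PostgreSQL's real conversion tables.
EUC_JP_TO_UCS = {0xA1C1: 0xFF5E}  # EUC-JP WAVE DASH -> U+FF5E (the chosen mapping)

# The reverse table is derived from the forward one, so each EUC-JP code
# has exactly one Unicode partner and the round trip is lossless.
UCS_TO_EUC_JP = {ucs: euc for euc, ucs in EUC_JP_TO_UCS.items()}


def utf8_to_euc_jp(ch: str) -> int:
    """Return the EUC-JP code for one character, or raise if it is unmapped."""
    code_point = ord(ch)
    if code_point not in UCS_TO_EUC_JP:
        utf8_hex = ch.encode("utf-8").hex()
        raise ValueError(
            f'character 0x{utf8_hex} of encoding "UTF8" '
            "has no equivalent in EUC_JP"
        )
    return UCS_TO_EUC_JP[code_point]


def euc_jp_to_utf8(code: int) -> str:
    """Return the character for one EUC-JP code in the toy table."""
    return chr(EUC_JP_TO_UCS[code])


# U+FF5E survives the round trip unchanged.
assert euc_jp_to_utf8(utf8_to_euc_jp("\uff5e")) == "\uff5e"

# U+301C is not in the table, so converting it fails.
try:
    utf8_to_euc_jp("\u301c")
except ValueError as exc:
    print(exc)
```

The UTF-8 encoding of U+301C is the byte sequence e3 80 9c, which is
the value quoted in the subject line. Adding a second entry for U+301C
to the reverse table alone would make the conversion succeed, but the
character would then come back from EUC-JP as U+FF5E, so the round
trip would no longer be exact.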

The same can be said of the MINUS SIGN.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
