Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8

From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8
Date: 2020-10-30 04:17:08
Message-ID: 20201030.131708.1285436931428714105.t-ishii@sraoss.co.jp
Lists: pgsql-hackers

> The mapping is generated from CP932.TXT and JIS0212.TXT by
> UCS_to_EUC_JP.pl.

I still don't understand why this change was made. Originally the
conversion was based on JIS0208.txt, JIS0212.txt and JIS0201.txt,
which together are the exact definition of EUC-JP. CP932.txt, by
contrast, is defined by Microsoft for their products.
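
To illustrate the difference between the two sources, here is a minimal
sketch in Python. It uses Python's built-in "euc_jp" and "cp932" codecs
as stand-ins for a JIS X 0208 based table and Microsoft's table; these
are not PostgreSQL's conversion maps, and the names EUC_JP_BYTES,
SJIS_BYTES and describe() are only illustrative. Both byte sequences
refer to the same JIS X 0208 position (row 1, cell 61).

```python
import unicodedata

# JIS X 0208 row 1, cell 61 (JIS code 0x215D).
# In EUC-JP the high bit is set on each byte: 0xA1 0xDD.
EUC_JP_BYTES = b"\xa1\xdd"
# The same position expressed in Shift_JIS / CP932: 0x81 0x7C.
SJIS_BYTES = b"\x81\x7c"


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Python's euc_jp codec is assumed here to follow the JIS X 0208 mapping.
jis_based = EUC_JP_BYTES.decode("euc_jp")
# Python's cp932 codec follows Microsoft's mapping.
ms_based = SJIS_BYTES.decode("cp932")

print("euc_jp:", describe(jis_based))
print("cp932: ", describe(ms_based))
```

If the two lines printed differ, a converter generated from CP932.TXT
will not round-trip this character the same way as one generated from
JIS0208.TXT.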

Probably we should call our "EUC-JP" something like "EUC-JP-MS" or
whatever to differentiate it from true EUC-JP.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
