Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: ishii(at)sraoss(dot)co(dot)jp
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8
Date: 2020-10-30 04:47:55
Message-ID: 20201030.134755.1051382563271744187.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Fri, 30 Oct 2020 13:17:08 +0900 (JST), Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> wrote in
> > The mapping is generated from CP932.TXT and JIS0212.TXT by
> > UCS_to_EUC_JP.pl.
>
> I still don't understand why this change has been made. Originally the
> conversion was based on JIS0208.txt, JIS0212.txt and JIS0201.txt,
> which is the exact definition of EUC-JP. CP932.txt is defined by
> Microsoft for their products.
>
> Probably we should call our "EUC-JP" something like "EUC-JP-MS" or
> whatever to differentiate from true EUC-JP.

That seems valid. Things were already this way when aeed17d was
introduced (I believe that commit made no difference to the
conversions); the change itself was made by a8bd7e1c6e in 2002.

I'm not sure what the point of that change was, though.
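
For illustration only, here is a minimal sketch of the difference between
the JIS X 0208 mapping and the CP932 mapping for this one character. It
uses Python's bundled codecs, not PostgreSQL's conversion tables, so it
only shows the two mapping conventions and says nothing about our code
path. The helper name describe() and the SAMPLES table are made up for
this example. Python has no CP932-derived EUC-JP codec, so Shift_JIS
versus CP932 stands in for the comparison; both encode the same
JIS X 0208 character (row 1, cell 61).

```python
import unicodedata

# JIS X 0208 row 1, cell 61 (0x215D), in the byte form each codec expects.
# Assumption: Python's euc_jp and shift_jis codecs follow the JIS X 0208
# mapping, while cp932 follows Microsoft's CP932 mapping.
SAMPLES = {
    "euc_jp": b"\xa1\xdd",
    "shift_jis": b"\x81\x7c",
    "cp932": b"\x81\x7c",
}


def describe(codec: str, raw: bytes) -> str:
    """Decode a single character and report its code point and name."""
    ch = raw.decode(codec)
    return f"{codec}: U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for codec, raw in SAMPLES.items():
        print(describe(codec, raw))
```

With the JIS-based codecs the character comes out as MINUS SIGN, and with
cp932 as FULLWIDTH HYPHEN-MINUS, which is the same discrepancy reported
in this thread.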

--
Kyotaro Horiguchi
NTT Open Source Software Center
