Re: JDBC to load UTF8@psql to latin1@mysql

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: emilu(at)encs(dot)concordia(dot)ca
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: JDBC to load UTF8@psql to latin1@mysql
Date: 2012-12-14 15:15:59
Message-ID: 29835.1355498159@sss.pgh.pa.us
Lists: pgsql-general

Emi Lu <emilu(at)encs(dot)concordia(dot)ca> writes:
> For now, through the following method, all letters are correctly
> transformed except "".

Meh. That character renders as \220 in your mail, which is not an
assigned code in ISO 8859-1. The numerically corresponding Unicode
value would be U+0090, which is an unspecified control character.

I surmise that your source data is not actually either Unicode or
ISO 8859-1, but one of the random "extended" character sets that
Microsoft has loosed upon the world, perhaps windows-1252:
http://en.wikipedia.org/wiki/Windows-1252

The conversion code that you're using is quite right to reject the
character as not being valid LATIN1. What you need to do is figure out
what the data actually is and correct its encoding. It's evidently
stored wrong in the UTF8 data, if you believe that this code is a
letter.
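
As a rough illustration of that diagnosis step, the Java sketch below
scans a string for C1 control characters (U+0080 through U+009F), which
usually signal that bytes from a Windows code page were decoded as
ISO 8859-1, and then re-decodes the text as windows-1252. The class and
method names are hypothetical and not from this thread, and the guess
that the data is windows-1252 is only an assumption. Note that 0x90
itself has no assignment in windows-1252, so a hit on that byte means
the real source encoding still needs to be identified.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class C1Check {
    // Hypothetical helper: index of the first C1 control character
    // (U+0080..U+009F) in s, or -1 if there is none.
    static int firstC1Control(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x80 && c <= 0x9F) {
                return i;
            }
        }
        return -1;
    }

    // Assumption: the text was originally windows-1252 bytes that were
    // decoded as ISO 8859-1. Only meaningful when every character is at
    // most U+00FF; anything higher is replaced with '?' by getBytes.
    static String reinterpretAsWindows1252(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // U+0092 is a C1 control; byte 0x92 in windows-1252 is U+2019,
        // the right single quotation mark.
        String suspect = "it\u0092s";
        System.out.println("first C1 control at index " + firstC1Control(suspect));
        System.out.println("repaired: " + reinterpretAsWindows1252(suspect));
    }
}
```

Running a check like this on the rows read from the UTF8 database,
before they are written to the latin1 side, shows which values carry
suspect code points rather than real letters.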

regards, tom lane
