Re: UTF8

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Markus Schaber <schabi(at)logix-tt(dot)com>
Cc: Bakos Sandor <dr_saca(at)freemail(dot)hu>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: UTF8
Date: 2006-06-02 08:19:08
Message-ID: 447FF47C.5000901@opencloud.com
Lists: pgsql-jdbc

Markus Schaber wrote:
> Hi, Bakos,
>
> Bakos Sandor wrote:
>
>
>>I get the following exception when I read a simple TXT file in Linux and
>>try to INSERT to the psql. (8.1.4)
>>
>>org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding
>>"UTF8" has no equivalent in "LATIN2"
>
>
> This means that your database is encoded in the ISO LATIN2 charset, and
> psql is telling the server that the data it sends is UTF-8. The server
> tries to convert the UTF-8 data into LATIN2, but there is a character
> (whose UTF-8 sequence is 0xefbfbd) that is not contained in LATIN2.
>
> Either your file is latin-2 in reality (or even another charset), then
> you should tell psql to use the latin-2 encoding.
>
> Or your file really is utf-8, and really contains characters not
> contained in latin-2. Then you have two possibilities: Edit the file and
> replace those characters with some transcription, or convert your
> database to utf-8 encoding (needs a dump&restore).

Actually, given that that's a Java JDBC exception, there's no 'psql'
client involved at all.

The JDBC driver always uses UTF8 as the client encoding, since that maps
easily from the native Java string representation (UTF-16) and every
possible Java String can be represented in UTF8. Not every possible
Java String can be represented in LATIN2, however, which is the cause of
the error.
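
To make the LATIN2 limitation concrete, here is a minimal sketch that
checks on the Java side whether a String has an equivalent in LATIN2
before it is sent to the server. It assumes that PostgreSQL's LATIN2
corresponds to Java's "ISO-8859-2" charset and that the JRE in use
provides that charset; the class and method names are made up for
illustration and are not part of the JDBC driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class Latin2Check {
    // Hypothetical helper: true if every character of s can be encoded
    // in ISO-8859-2 (assumed here to match PostgreSQL's LATIN2).
    static boolean fitsInLatin2(String s) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-2").newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // Hungarian letters such as U+0151 and U+00FC exist in ISO-8859-2.
        System.out.println(fitsInLatin2("\u0151r\u00fclt"));
        // U+FFFD (the Unicode replacement character) does not.
        System.out.println(fitsInLatin2("bad\uFFFDtext"));
    }
}
```

A check like this only reports that a String would be rejected; it does
not say where the unrepresentable character came from.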

My guess is that the problem occurs when *reading* the text file
originally: the wrong encoding is being used to convert the bytes to
Java Strings. If you don't use the right encoding there, the Java
String you end up with will be garbage.
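
The byte sequence 0xefbfbd in the error message is the UTF-8 encoding of
U+FFFD, the Unicode replacement character, which Java decoders
substitute for input bytes they cannot decode. The sketch below shows
how such a character can appear when LATIN2 bytes are decoded as UTF-8.
The original poster's file-reading code is not shown in this thread, so
the sample bytes, the class name, and the helper are assumptions for
illustration only; "ISO-8859-2" is again assumed to be available in the
JRE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Hypothetical helper: convert raw file bytes to a String using an
    // explicitly chosen charset instead of the platform default.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // In ISO-8859-2 the letter U+0151 is the single byte 0xF5,
        // which is not valid anywhere in a UTF-8 stream.
        byte[] fileBytes = { 'h', (byte) 0xF5 };

        // Wrong charset: the undecodable byte becomes U+FFFD.
        String wrong = decode(fileBytes, StandardCharsets.UTF_8);
        // Right charset: the byte becomes the intended letter.
        String right = decode(fileBytes, Charset.forName("ISO-8859-2"));

        System.out.println(wrong.indexOf('\uFFFD') >= 0);
        System.out.println(right.equals("h\u0151"));
    }
}
```

The same choice applies when reading through a Reader: passing the
charset explicitly to an InputStreamReader avoids depending on the
platform default encoding.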

-O
