Re: UTF8

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Markus Schaber <schabi(at)logix-tt(dot)com>
Cc: Bakos Sandor <dr_saca(at)freemail(dot)hu>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: UTF8
Date: 2006-06-02 08:19:08
Lists: pgsql-jdbc
Markus Schaber wrote:
> Hi, Bakos,
> Bakos Sandor wrote:
>>I get the following exception when I read a simple TXT file in Linux and
>>try to INSERT to the psql. (8.1.4)
>>org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding
>>"UTF8" has no equivalent in "LATIN2"
> This means that your database is encoded in ISO-LATIN2 charset, and psql
> is telling the server the data it sends is UTF-8. The server tries to
> convert the UTF-8 Data into LATIN2, but there is a character (whose
> UTF8-Sequence is 0xefbfbd) that is not contained in LATIN-2.
> Either your file is latin-2 in reality (or even another charset), then
> you should tell psql to use the latin-2 encoding.
> Or your file really is utf-8, and really contains characters not
> contained in latin-2. Then you have two possibilities: Edit the file and
> replace those characters with some transcription, or convert your
> database to utf-8 encoding (needs a dump&restore).

Actually, given that that's a Java JDBC exception, there's no 'psql' 
client involved at all.

The JDBC driver always uses UTF8 as the client encoding, since that maps
easily from the native Java string representation (UTF-16) and every
possible Java String can be represented in UTF8. Of course, not every
possible Java String can be represented as LATIN2, which is the cause of
the error.
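
To illustrate the last point, here is a minimal sketch that asks Java whether a given String can be encoded as ISO 8859-2, which is what PostgreSQL calls LATIN2. The class and method names are made up for illustration, and the check runs purely on the Java side; it is not how the server performs its conversion.

```java
import java.nio.charset.Charset;

class Latin2Check {
    // PostgreSQL's LATIN2 corresponds to ISO 8859-2.
    private static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // Hypothetical helper: true if every character of s exists in LATIN2.
    static boolean fitsInLatin2(String s) {
        return LATIN2.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        // U+0151 (o with double acute) is part of LATIN2.
        System.out.println(fitsInLatin2("\u0151")); // true
        // U+20AC (euro sign) is not.
        System.out.println(fitsInLatin2("\u20AC")); // false
        // U+FFFD (replacement character) is not either; its UTF-8
        // encoding is the 0xefbfbd from the error message.
        System.out.println(fitsInLatin2("\uFFFD")); // false
    }
}
```

A check like this before the INSERT would probably show which input strings the server is going to reject.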

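I would guess that the problem is that when *reading* the text file
originally, the wrong encoding is being used to convert the bytes to
Java Strings. The sequence 0xefbfbd in the error is the UTF8 encoding of
U+FFFD, the Unicode replacement character, which Java's decoders
substitute for bytes they cannot decode. If you don't use the right
encoding here, then the Java String you end up with will be garbage.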
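
A rough sketch of what I mean, assuming the file is really ISO 8859-2 and a Java version that has java.nio.file (7 or later). The class name, helper names and temp file are made up for the demonstration; the actual encoding of the original file is something only the poster can confirm.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadWithCharset {
    // Convert raw bytes to a String using an explicitly named charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Hypothetical helper: read a whole file with an explicit charset
    // instead of relying on the platform default.
    static String readFile(Path file, Charset cs) throws IOException {
        return decode(Files.readAllBytes(file), cs);
    }

    public static void main(String[] args) throws IOException {
        Charset latin2 = Charset.forName("ISO-8859-2");

        // In ISO 8859-2 the single byte 0xF5 is U+0151 (o with double acute).
        Path file = Files.createTempFile("latin2-demo", ".txt");
        Files.write(file, new byte[] { (byte) 0xF5 });

        // Wrong charset: 0xF5 is never valid in UTF-8, so the decoder
        // substitutes U+FFFD, which later goes to the server as 0xefbfbd.
        String wrong = readFile(file, StandardCharsets.UTF_8);
        // Right charset: the intended character comes back.
        String right = readFile(file, latin2);

        System.out.println(wrong.equals("\uFFFD")); // true
        System.out.println(right.equals("\u0151")); // true

        Files.delete(file);
    }
}
```

If the file was read with a FileReader or an InputStreamReader without a charset argument, the platform default is used, which on a UTF-8 Linux locale would give exactly the "wrong" case above.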

