Re: [JDBC] Using Postgres with Latin1 (ISO8859-1) and Unicode (utf-8)

From: Kris Jurka <books(at)ejurka(dot)com>
To: "J(dot) Michael Crawford" <jmichael(at)gwi(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [JDBC] Using Postgres with Latin1 (ISO8859-1) and Unicode (utf-8)
Date: 2004-11-08 17:15:19
Message-ID: Pine.BSO.4.56.0411081210050.23287@leary.csoft.net
Lists: pgsql-general pgsql-jdbc

On Mon, 8 Nov 2004, J. Michael Crawford wrote:
>
> Even in Java, where you can do all sorts of character-encoding
> translation, it can be impossible to translate data retrieved from Postgres
> if it's in the wrong encoding. We've tried changing the JVM encoding,
> altering the jdbc driver, translating encodings on the database read, and
> translating encodings after the read while building a new string, to no
> avail. We tried 25 combinations of each strategy (five different possible
> read encodings and five different possible string encodings), and nothing
> worked. We could get an application working in one JVM with one encoding,
> but another JVM would break, and no amount of translation would help.
>
> But when we finally told Postgres what to return, everything worked like
> a charm.
>
> Just as with step two, the key is to use the "SET CLIENT_ENCODING TO
> (encoding)" sql command. If you're using an application where you can send
> SQL to the server, this is all you need. In something like MS Access,
> you'll have to move to a passthrough query. For Java, you'll need to send
> a command through JDBC:
>
> String DBEncoding = "Unicode"; // use a real encoding, either returned from
> // the JVM or explicitly stated
> PreparedStatement statement = dbCon.prepareStatement(
>     "SET CLIENT_ENCODING TO '" + DBEncoding + "'");
> statement.execute();
>

This is bad advice for a Java client and does not work. The JDBC driver
always expects data in Unicode and issues a SET client_encoding of its
own at connection startup to make sure it gets Unicode data. Changing
this to another encoding will break the driver, and in the CVS version a
check has been added to error out if it detects you doing this.
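
As a rough illustration of the alternative (a sketch, not driver code):
since the driver hands the application ordinary Java Strings, any
conversion to Latin1 or UTF-8 bytes can be done in Java at the point
where the data leaves the application, with client_encoding left alone.
The class and method names below are hypothetical, and the literal
string stands in for a value that would normally come from
ResultSet.getString().

```java
import java.nio.charset.StandardCharsets;

public class EncodingBoundary {
    // Encode a Java String as ISO-8859-1 (Latin1) bytes for output.
    public static byte[] toLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encode the same String as UTF-8 bytes for output.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for a value read via ResultSet.getString(...);
        // no SET CLIENT_ENCODING is sent by the application.
        String value = "caf\u00e9";

        // One byte per character in Latin1, two bytes for the
        // accented letter in UTF-8.
        System.out.println("Latin1 bytes: " + toLatin1(value).length);
        System.out.println("UTF-8 bytes:  " + toUtf8(value).length);
    }
}
```

The point of the sketch is only that the target encoding is chosen when
bytes are produced, not on the database connection.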

Kris Jurka
