Re: [JDBC] Using Postgres with Latin1 (ISO8859-1)

From: "J(dot) Michael Crawford" <jmichael(at)gwi(dot)net>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [JDBC] Using Postgres with Latin1 (ISO8859-1)
Date: 2004-11-08 21:12:45
Message-ID: 6.1.2.0.2.20041108160948.02fd4638@pop.suscom-maine.net
Lists: pgsql-general pgsql-jdbc

<<This is bad advice for a Java client and does not work.>>

Well then, perhaps we shouldn't share the procedure with other folks. I
apologize if I'm introducing some misinformation.

However, this has been the only way to get our system to work on more
than one JVM. People from this group provided many suggestions, people
from other groups did the same, and nothing helped. Taking bytes and
translating encodings (examples follow my signature below) had no
effect. Changing the URL of the Postgres connection to include an encoding
also had no effect. Setting the encoding for the entire JVM didn't work
either. The data worked in a Linux JVM or in a Windows JVM, but never in both.

So, if you're going to correct us for the wrong solution (which I'm glad
you have done), do you have any suggestions as to what the right solution
might be?

- Mike

Encoding translations that didn't work:

a) Getting encoded bytes from the result set. We tried the following block
five times, once for each different encoding we were trying to test with
the database:

dataRead = new String(result.getBytes(longName),"utf-8");
dataLatin_a = new String(dataRead.getBytes("ISO-8859-1"));
dataLatin_b = new String(dataRead.getBytes("Latin1"));
dataUnicode_a = new String(dataRead.getBytes("utf-8"));
dataUnicode_b = new String(dataRead.getBytes("UTF8"));
dataWin = new String(dataRead.getBytes("Cp1252"));

b) Getting a string, turning it into bytes, and then translating. Same
process as above, but using result.getString...

No matter what, strings showed up as gibberish in one JVM or another,
depending upon the native encoding of the database. A Latin1 database
worked in the Windows JVM, and a Unicode database in the Linux JVM, but
not the other way around.
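
One possible reading of the results above, offered as a sketch rather than a diagnosis: every line in block (a) after the first calls new String(byte[]) with no charset argument, and that constructor decodes with the JVM's default charset. On JVMs of that era the default followed the platform locale, so the same line could behave differently on Windows and on Linux. The class and method names below (EncodingRoundTrip, roundTrip, decodeWithDefault) are hypothetical and not part of the original code; no database is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {

    // Encode text with one charset and decode the bytes with another,
    // naming both charsets explicitly so the result is the same on every JVM.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    // The pattern used in block (a): no charset is given, so the bytes are
    // decoded with Charset.defaultCharset(), which can differ between JVMs.
    public static String decodeWithDefault(byte[] bytes) {
        return new String(bytes);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent, written as an escape so the encoding
        // of this source file does not matter.
        String original = "caf\u00e9";

        // Same charset on both sides: the text survives.
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(original)); // true

        // UTF-8 bytes decoded as ISO-8859-1: the two bytes of the accented
        // letter become two separate characters (U+00C3 and U+00A9).
        String garbled = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("caf\u00c3\u00a9")); // true

        // The one-argument constructor depends on this value.
        System.out.println("default charset: " + Charset.defaultCharset());
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodeWithDefault(latin1Bytes).equals(original));
    }
}
```

The last line prints true only when the default charset happens to decode a Latin1 byte correctly, which is the kind of per-JVM difference described above. Since a Java String is already Unicode internally, a string that was decoded correctly needs no further translation, and one that was decoded incorrectly generally cannot be repaired by re-encoding it.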

At 12:15 PM 11/8/2004, Kris Jurka wrote:
>
>
>On Mon, 8 Nov 2004, J. Michael Crawford wrote:
>>
>> Even in Java, where you can do all sorts of character-encoding
>> translation, it can be impossible to translate data retrieved from
>> Postgres
>> if it's in the wrong encoding. We've tried changing the JVM encoding,
>> altering the jdbc driver, translating encodings on the database read, and
>> translating encodings after the read while building a new string, to no
>> avail. We tried 25 combinations of each strategy (five different possible
>> read encodings and five different possible string encodings), and nothing
>> worked. We could get an application working in one JVM with one encoding,
>> but another JVM would break, and no amount of translation would help.
>>
>> But when we finally told Postgres what to return, everything worked like
>> a charm.
>>
>> Just as with step two, the key is to use the "SET CLIENT_ENCODING TO
>> (encoding)" sql command. If you're using an application where you can
>> send SQL to the server, this is all you need. In something like MS Access,
>> you'll have to move to a passthrough query. For Java, you'll need to send
>> a command through JDBC:
>>
>> // use a real encoding, either returned from the JVM or explicitly stated
>> String DBEncoding = "Unicode";
>> PreparedStatement statement = dbCon.prepareStatement(
>>     "SET CLIENT_ENCODING TO '" + DBEncoding + "'");
>> statement.execute();
>>
>
>This is bad advice for a Java client and does not work. The JDBC driver
>always expects data in Unicode and issues a SET client_encoding of its
>own at connection startup to make sure it gets Unicode data. Changing
>this to another encoding will break the driver, and in the CVS version a
>check has been added to error out if it detects you doing this.
>
>Kris Jurka
>
>---------------------------(end of broadcast)---------------------------
>TIP 4: Don't 'kill -9' the postmaster
