| From: | "J(dot) Michael Crawford" <jmichael(at)gwi(dot)net> | 
|---|---|
| To: | Kris Jurka <books(at)ejurka(dot)com> | 
| Cc: | pgsql-general(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org | 
| Subject: | Re: [JDBC] Using Postgres with Latin1 (ISO8859-1) | 
| Date: | 2004-11-08 21:12:45 | 
| Message-ID: | 6.1.2.0.2.20041108160948.02fd4638@pop.suscom-maine.net | 
| Lists: | pgsql-general pgsql-jdbc | 
<<This is bad advice for a Java client and does not work.>>
   Well then, perhaps we shouldn't share the procedure with other folks.  I 
apologize if I'm introducing some misinformation.
   However, this has been the only way to get our system to work on more 
than one JVM.  People from this group provided many suggestions, people 
from other groups did the same, and nothing helped.  Taking bytes and 
translating encodings (examples follow my signature below) had no 
effect.  Changing the URL of the Postgres connection to include an encoding 
also had no effect.  Setting the encoding for the entire JVM didn't work 
either.  The data worked in either a Linux JVM or a Windows JVM, but not both.
   So, if you're going to correct us for the wrong solution (which I'm glad 
you have done), do you have any suggestions as to what the right solution 
might be?
- Mike
Encoding translations that didn't work:
a) Getting encoded bytes from the result set.  We tried the following block 
five times, once for each different encoding we were trying to test with 
the database:
dataRead = new String(result.getBytes(longName),"utf-8");
dataLatin_a = new String(dataRead.getBytes("ISO-8859-1"));
dataLatin_b = new String(dataRead.getBytes("Latin1"));
dataUnicode_a = new String(dataRead.getBytes("utf-8"));
dataUnicode_b = new String(dataRead.getBytes("UTF8"));
dataWin = new String(dataRead.getBytes("Cp1252"));
b)  Getting a string, turning it into bytes, and then translating.  Same 
process as above, but we use result.getString...
   No matter what, strings showed up as gibberish in one JVM or another, 
depending upon the native encoding of the database.  A Latin1 database 
worked in the Windows JVM, a Unicode one in the Linux JVM, but not the other 
way around.
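
A possible explanation for the symptom above, offered as an assumption and not something established in this thread: `new String(bytes)` with no charset argument decodes with the JVM's platform default charset, so the same line can yield different text on JVMs with different defaults. The sketch below simulates two defaults with explicit charsets. The class name `EncodingRoundTrip` and the helper `reencode` are hypothetical, and `StandardCharsets` requires Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Mimics `new String(s.getBytes(encodeAs))` on a JVM whose platform
    // default charset is `platformDefault` (hypothetical helper).
    public static String reencode(String s, Charset encodeAs, Charset platformDefault) {
        byte[] raw = s.getBytes(encodeAs);
        return new String(raw, platformDefault);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent, written as an escape so the result
        // does not depend on the encoding of this source file.
        String original = "caf\u00e9";

        // Default matches the bytes: the text survives the round trip.
        String onLatin1Jvm = reencode(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        // Default is UTF-8: the lone byte 0xE9 is not valid UTF-8 and is
        // replaced with U+FFFD, so the text is damaged.
        String onUtf8Jvm = reencode(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println(onLatin1Jvm.equals(original)); // true
        System.out.println(onUtf8Jvm.equals(original));   // false
    }
}
```

Naming the charset on both `getBytes` and `new String` removes the dependence on the platform default, which is one thing to check before concluding that no translation can work.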
At 12:15 PM 11/8/2004, Kris Jurka wrote:
 >
 >
 >On Mon, 8 Nov 2004, J. Michael Crawford wrote:
 >>
 >>    Even in Java, where you can do all sorts of character-encoding
 >> translation, it can be impossible to translate data retrieved from Postgres
 >> if it's in the wrong encoding.  We've tried changing the JVM encoding,
 >> altering the jdbc driver, translating encodings on the database read, and
 >> translating encodings after the read while building a new string, to no
 >> avail.  We tried 25 combinations of each strategy (five different possible
 >> read encodings and five different possible string encodings), and nothing
 >> worked.  We could get an application working in one JVM with one encoding,
 >> but another JVM would break, and no amount of translation would help.
 >>
 >>    But when we finally told Postgres what to return, everything worked like
 >> a charm.
 >>
 >>    Just as with step two, the key is to use the "SET CLIENT_ENCODING TO
 >> (encoding)" SQL command.  If you're using an application where you can send
 >> SQL to the server, this is all you need.  In something like MS Access,
 >> you'll have to move to a passthrough query.  For Java, you'll need to send
 >> a command through JDBC:
 >>
 >> // use a real encoding, either returned from the JVM or explicitly stated
 >> String DBEncoding = "Unicode";
 >> PreparedStatement statement = dbCon.prepareStatement(
 >>     "SET CLIENT_ENCODING TO '" + DBEncoding + "'");
 >> statement.execute();
 >>
 >
 >This is bad advice for a Java client and does not work.  The JDBC driver
 >always expects data in Unicode and issues a SET client_encoding of its
 >own at connection startup to make sure it gets Unicode data.  Changing
 >this to another encoding will break the driver, and in the CVS version a
 >check has been added to error out if it detects you doing this.
 >
 >Kris Jurka
 >