Re: Encoding weirdness with JDBC, driver crashing?

From: Barry Lind <barry(at)xythos(dot)com>
To: Nikola Milutinovic <Nikola(dot)Milutinovic(at)ev(dot)co(dot)yu>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Encoding weirdness with JDBC, driver crashing?
Date: 2001-12-04 03:51:16
Message-ID: 3C0C4834.60009@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Nikola,

Thank you for the test case. I have been able to reproduce the problem
with your test code and discovered that the problem appears to be a bug
in your code, not in the jdbc driver.

In your test program you had the following code:

System.out.println ("ID: " + id + " NAME: " + name);

This use of println is going to convert the string to be printed into
the default character encoding for the JVM, which is probably not what
you want (and gives me the wrong result). However if you explicitly
convert to the desired character set (LATIN2 in this case) then the
output is correct. So I changed your test program and replaced the line
above with:

String l_mesg = "ID: " + id + " NAME: " + name;
//output using the default jvm encoding
System.out.println (l_mesg);
try {
//output using an explicit encoding
System.out.write( l_mesg.getBytes("LATIN2") );
System.out.println();
} catch (Exception l_e) {
System.out.println("error : "+l_e.toString());
System.exit(1);
}

Below is the output of my testing showing that the jdbc driver returns
the same values as psql does (when using the correct charset to print
out the results). I performed this test with both the 7.1 driver and
the 7.2 driver. Therefore I don't see any bug with the jdbc driver.

thanks,
--Barry

[blind(at)barry pgsql]$ createdb -E LATIN2 test
CREATE DATABASE
[blind(at)barry pgsql]$ psql test
Welcome to psql, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

test=# \l
List of databases
Name | Owner | Encoding
-----------+-------+-----------
template0 | blind | SQL_ASCII
template1 | blind | SQL_ASCII
test | blind | LATIN2
(3 rows)

test=# create table test_temp (id integer, test_text text);
CREATE
test=# insert into test_temp values( 1, 'Milutinovi\346 \251 \271 \306
\346 \310 \350 \320 \360 \256 \276');
INSERT 126055 1
test=# select * from test_temp;
id | test_text
----+---------------------------------
1 | Milutinoviæ © ¹ Æ æ È è Ð ð ® ¾
(1 row)

test=# \q
[blind(at)barry pgsql]$ cd ~/work
[blind(at)barry work]$ java test3psql
Connecting with: jdbc:postgresql://localhost/test
ID: 1 NAME: Milutinovi? ? ? ? ? ? ? ? ? ? ?
ID: 1 NAME: Milutinoviæ © ¹ Æ æ È è Ð ð ® ¾
[blind(at)barry work]$

Nikola Milutinovic wrote:

> Barry Lind wrote:
>
>> Nikola,
>>
>> You shouldn't need to pass any encoding and you certainly shouldn't
>> pass an encoding that is different than the database encoding.
>>
>> If the database is created with LATIN2 (verify by 'select
>> getdatabaseencoding()' from psql) then the jdbc driver will
>> automatically convert from/to the database encoding to/from the
>> unicode internal representation java uses.
>>
>> If you explicitly set an encoding different than the database is using
>> you will likely have problems since then the driver will convert
>> from/to this encoding instead of the encoding the database is using.
>>
>> If you are still having problems, please post a simple test case that
>> I can run locally to reproduce your problem
>
>
> Hi Barry.
>
> After some time, I have decided to clean-up the mess of my DB and I
> created TWO additional databases. First, I wanted to convert all my data
> from WIN1250 (inside a LATIN1 database, which is obviously an invitation
> for the Devil...) and to get to a new data structure (I had a company's
> e-mail address book inside and I wanted to switch from a fixed 3-layer
> hierarchy to a regular n-tree hierarchy). While I was at it, I thought I
> could switch from LATIN2 to UNICODE.
>
> So, I have two databases: "import" with LATIN2 encoding and "www" with
> UNICODE encoding. I first exported the old database, using:
>
> pg_dump -C -d -f mercury_dump.sql mercury
>
> Then I editted the resulting file, namely, I added "\encoding WIN1250"
> to it, so the data on the input would be treated for what it was - a
> WIN1250 encoded data.
>
> Then I fed that to "import" database. After that I wrote a small Java
> application which took data from "import" and transformed it to "www". I
> thought it would convert the strings from LATIN2 to UNICODE, along the way.
>
> I think all of the conversions went OK. If I could get pg_dump to dump
> extended ASCII as octal values, I could verify for sure (know how I
> could do it?).
>
> THE PROBLEM
> -----------
>
> Of course, the problem is JDBC. Again, I see those dreaded "?" instead
> of our alphabet characters. So, I took the old test Java application and
> fired it up, just to see what is going on. Just to make myself perfectly
> clear, the letters I'm interested in are: c-acsan, c-caron, s-caron,
> z-caron and d-slash
>
> This is the result:
>
> Database "import" LATIN2 encoding
> ---------------------------------
>
> charSet=(unspecified): I get "?" for all of our letters
> charSet=LATIN1: I get what appears to be LATIN2 encoded data
> charSet=LATIN2: I get "?"
> charSet=UNICODE: I get SQLException.
>
> Database "www" UNICODE encoding
> -------------------------------
>
> charSet=(unspecified): I get "?" for all of our letters
> charSet=LATIN1: I get what appears to be depricated UNICODE encoded data
> charSet=LATIN2: I get what looks like depricated UNICODE, but
> slightly different from the former case.
> charSet=UNICODE: I get SQLException.
>
> THE TEST CASE
> -------------
>
> You asked me if I could provide a test example, well here goes.
>
> 1. create a database with LATIN2 encoding
> 2. create a table
> CREATE TABLE (
> id int4,
> test_text text
> );
> 3. insert some data into it.
> insert into test_temp values( 1, 'Milutinovi\346 \251 \271 \306 \346
> \310 \350 \320 \360 \256 \276');
>
> 4. try to fetch this textual field. I'm attaching a Java application I
> used for testing.
>
> ADDITIONAL
> ----------
>
> Should it appear to work, could you send me the JDBC driver you're
> using, so I could test it on my side?
>
> Nix.
>
>
> ------------------------------------------------------------------------
>
> import java.sql.*;
>
> public class test3psql {
> // static public final String CONN_URL = "jdbc:postgresql://Mercury.ev.co.yu/www";
> static public final String CONN_URL = "jdbc:postgresql://Mercury.ev.co.yu/import";
> static public final String user = "www";
> static public final String pass = "test00";
>
> static private Connection conn;
> static private String connString;
> static private Statement st;
> static private ResultSet rs;
> static private int id=1;
>
> static public void main( String args[] )
> throws SQLException, ClassNotFoundException {
> conn=null;
> Class.forName( "org.postgresql.Driver" );
> connString = CONN_URL;
> if( args.length == 1 ) {
> connString += "?charSet=" + args[0];
> }
> System.out.println( "Connecting with: " + connString );
> conn = DriverManager.getConnection( connString, user, pass );
> if( conn != null ) {
> st = conn.createStatement();
> // rs = st.executeQuery( "SELECT * FROM e_user WHERE surname LIKE 'Ivkovi%'" );
> rs = st.executeQuery( "SELECT * FROM test_temp" );
> while( rs.next() ) {
> //String name = rs.getString( "name" );
> //String surn = rs.getString( "surname" );
> String name = rs.getString( "tekst" );
> String surn = "EMPTY";
> int id = rs.getInt( "id" );
>
> System.out.println ( "ID: " + id + " NAME: " + name + " SURNAME: " + surn );
> }
> rs.close();
> st.close();
> conn.close();
> }
> }
> }
>
>
> ------------------------------------------------------------------------
>
> java -classpath /usr/local/pgsql/share/java/postgresql.jar:. test3psql $1
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Laszlo Hornyak 2001-12-04 09:12:30 Re: java stored procedures
Previous Message Dave Cramer 2001-12-04 01:17:26 Re: Exception?