
Encoding weirdness with JDBC, driver crashing?

From: Nikola Milutinovic <Nikola(dot)Milutinovic(at)ev(dot)co(dot)yu>
To: PostgreSQL JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Encoding weirdness with JDBC, driver crashing?
Date: 2001-11-03 15:13:19
Message-ID: 3BE4098F.7030303@ev.co.yu
Lists: pgsql-jdbc

Hi all.

I'm seeing weird behaviour with a JDBC connection and the charSet encoding.

OS: Digital UNIX 4.0D/F
DB: PostgreSQL 7.1.2 and 7.1.3

I have created a database with the "-E LATIN2" option. Then I imported
WIN1250-encoded data into it. The data was generated from a set of static HTML
pages and was loaded with the client encoding set to WIN1250.

The data looks OK from "psql", and changing the client encoding yields the
expected result. I'm pretty sure it is as it should be.
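
To illustrate why the client encoding matters at load time, here is a minimal Java sketch of the kind of byte conversion involved when WIN1250 input is stored as LATIN2. It does not touch PostgreSQL at all; the class name `Win1250ToLatin2`, the helper `transcode` and the sample byte are assumptions chosen for this example only.

```java
import java.nio.charset.Charset;

class Win1250ToLatin2 {
    // Re-encode bytes from one charset to another by going through a Java String.
    public static byte[] transcode(byte[] input, String from, String to) {
        String text = new String(input, Charset.forName(from));
        return text.getBytes(Charset.forName(to));
    }

    public static void main(String[] args) {
        // 0x9A is "s with caron" (U+0161) in windows-1250 (sample value for illustration).
        byte[] win = { (byte) 0x9A };
        byte[] latin2 = transcode(win, "windows-1250", "ISO-8859-2");
        // The same letter sits at a different byte value in ISO-8859-2.
        System.out.printf("windows-1250 0x%02X -> ISO-8859-2 0x%02X%n",
                win[0] & 0xFF, latin2[0] & 0xFF);
    }
}
```

If the bytes had been loaded without declaring WIN1250 as the client encoding, they would presumably have been stored unchanged, and letters such as this one would come back as the wrong character.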

The JDBC interface, however, behaves in a very weird manner:

URL: jdbc:postgresql://localhost/mercury
OUT: all our alphabet-specific characters are turned into "?"

URL: jdbc:postgresql://localhost/mercury?charSet=LATIN1
OUT: I get data OK - LATIN2 encoded!!!

URL: jdbc:postgresql://localhost/mercury?charSet=LATIN2
OUT: all our alphabet-specific characters are turned into "?"

URL: jdbc:postgresql://localhost/mercury?charSet=UNICODE
OUT: JDBC connection crashes with:

Exception in thread "main" java.sql.SQLException:
  at org.postgresql.Connection.ExecSQL(Connection.java, Compiled Code)
  at org.postgresql.jdbc2.Statement.execute(Statement.java, Compiled Code)
  at org.postgresql.jdbc2.Statement.executeQuery(Statement.java, Compiled Code)
  at test2PostgreSQL.main(test2PostgreSQL.java, Compiled Code)

On the server side, PostgreSQL spits out:

ERROR:  parser: parse error at or near "t?"
FATAL 1:  Socket command type S unknown

(On my terminal that "t?" looks really strange: two characters I cannot even
describe. I guess copy/paste changed it to "t?".)

So, does anyone have an idea what is going on? I can live with "charSet=LATIN1"
for the moment, but I have a nasty feeling that the data is not being loaded as
it should be. Namely, I'm not sure that, for instance, a "c-acsan" letter that
is Latin-2 encoded in PostgreSQL is really transformed into a Unicode-encoded
"c-acsan" inside my Java application.
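
One way to check this worry is to look at the Unicode code points of the strings the application receives, rather than at what the terminal displays. The sketch below is a plain-Java illustration under stated assumptions: it takes the letter in question to be "c with acute" (byte 0xE6 in ISO-8859-2, U+0107 in Unicode), and the class and method names (`CharSetCheck`, `decode`, `hexCodePoints`, `encode`) are hypothetical. It does not use the PostgreSQL driver, so it only shows how Java itself treats the bytes.

```java
import java.nio.charset.Charset;

class CharSetCheck {
    // Interpret raw bytes using the named charset.
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Render each UTF-16 code unit of the string as U+XXXX.
    public static String hexCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Encode a string; unmappable characters become the charset's replacement byte.
    public static byte[] encode(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xE6 }; // assumed sample: "c with acute" in ISO-8859-2

        // Decoded as Latin-2, the byte becomes the intended letter.
        System.out.println(hexCodePoints(decode(raw, "ISO-8859-2")));
        // Decoded as Latin-1, the same byte becomes a different character.
        System.out.println(hexCodePoints(decode(raw, "ISO-8859-1")));
        // Writing the intended letter to a Latin-1 output yields a question mark.
        System.out.println((char) encode("\u0107", "ISO-8859-1")[0]);
    }
}
```

If the same pattern holds in the real application, it might explain the symptoms: with a Latin-1 decoding the bytes pass through unchanged and merely look right on a Latin-2 terminal, while a correctly decoded string printed through a Latin-1 output stream shows "?". This is only a hypothesis; printing `hexCodePoints` of a value read via JDBC would confirm or refute it.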

Since I'm mostly targeting JSP here, I'll live with it, but I have an uneasy
feeling about it. I think this issue should be addressed.

PostgreSQL was built with:

--enable-locale              enable locale support
--enable-recode              enable character set recode support
--enable-multibyte           enable multibyte character support
--enable-unicode-conversion  enable unicode conversion support

TYIA,
Nix.

