Re: [JDBC] Troubles using German Umlauts with JDBC

From: Barry Lind <barry(at)xythos(dot)com>
To: Rene Pijlman <rene(at)lab(dot)applinet(dot)nl>
Cc: pgsql-jdbc(at)postgresql(dot)org, Alexander Troppmann <talex(at)globalinxs(dot)de>, pgsql-hackers(at)postgresql(dot)org, pgman(at)candle(dot)pha(dot)pa(dot)us, Dave Cramer <Dave(at)micro-automation(dot)net>
Subject: Re: [JDBC] Troubles using German Umlauts with JDBC
Date: 2001-09-04 17:40:36
Message-ID: 3B951214.1080104@xythos.com
Lists: pgsql-hackers, pgsql-jdbc

Rene,

I would like to add one additional comment.  In the current sources, the
JDBC driver detects (through a hack) that the server doesn't have
multibyte enabled; it then ignores the SQL_ASCII value the server reports
and defaults to the JVM's character set instead.

The problem boils down to the fact that without multibyte enabled, the
server has no way of specifying which 8-bit character set is being used
for a particular database.  A client like JDBC therefore doesn't know
which character set to use when converting to Unicode.  The best we can
do in JDBC is make a guess (the JVM's character set is probably the best
default) and allow the user to explicitly specify something else if
necessary.
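
To make the fallback order concrete, here is a minimal sketch of the
decision described above.  It is not the driver's actual code: the class
EncodingGuess and the methods resolve and asciiRoundTrip are hypothetical
names used only for illustration, and the sketch assumes the server's
encoding name is already a name Java understands.  The second method
shows why an umlaut turns into "?" when the text is treated as 7-bit
ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the JDBC driver sources.
final class EncodingGuess {

    // Pick the charset used to convert between database bytes and Java strings.
    // serverEncoding: the name reported by the server (assumed to be a valid Java charset name
    //                 unless it is SQL_ASCII).
    // userCharSet:    an explicit override such as the "charSet" connection property, or null.
    static Charset resolve(String serverEncoding, String userCharSet) {
        if (userCharSet != null) {
            // An explicit user setting always wins.
            return Charset.forName(userCharSet);
        }
        if ("SQL_ASCII".equals(serverEncoding)) {
            // The server cannot say which 8-bit set is in use, so guess the JVM default.
            return Charset.defaultCharset();
        }
        return Charset.forName(serverEncoding);
    }

    // Encode as 7-bit ASCII and decode again; unmappable characters become '?'.
    static String asciiRoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(resolve("SQL_ASCII", null));          // JVM default, varies by platform
        System.out.println(resolve("SQL_ASCII", "ISO-8859-1"));  // explicit override
        System.out.println(asciiRoundTrip("Gr\u00fc\u00dfe"));   // umlaut and sharp s are lost
    }
}
```

The JVM default is only a guess; if the database actually holds a
different 8-bit set, the explicit override is the only reliable option.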

thanks,
--Barry

Rene Pijlman wrote:
> [forwarding to pgsql-hackers and Bruce as Todo list maintainer,
> see comment below]
> 
> [insert with JDBC converts Latin-1 umlaut to ?]
> On 04 Sep 2001 09:54:27 -0400, Dave Cramer wrote:
> 
>>You have to set the encoding when you make the connection.
>>
>>Properties props = new Properties();
>>props.put("user",user);
>>props.put("password",password);
>>props.put("charSet",encoding);
>>Connection con = DriverManager.getConnection(url,props);
>>where encoding is the proper encoding for your database
>>
> 
> For completeness, I quote the answer Barry Lind gave yesterday. 
> 
> "[the driver] asks the server what character set is being used
> for the database.  Unfortunately the server only knows about
> character sets if multibyte support is compiled in. If the
> server is compiled without multibyte, then it always reports to
> the client that the character set is SQL_ASCII (where SQL_ASCII
> is 7bit ascii).  Thus if you don't have multibyte enabled on the
> server you can't support 8bit characters through the jdbc
> driver, unless you specifically tell the connection what
> character set to use (i.e. override the default obtained from
> the server)."
> 
> This really is confusing and I think PostgreSQL should be able
> to support single byte encoding conversions without enabling
> multi-byte. 
> 
> At the very least there should be a --enable-encoding-conversion
> or something similar, even if it just enables the current
> multibyte support.
> 
> Bruce, can this be put on the TODO list one way or the other?
> This problem has appeared 4 times in two months or so on the
> JDBC list.
> 
> Regards,
> René Pijlman <rene(at)lab(dot)applinet(dot)nl>