Re: [HACKERS] Troubles using German Umlauts with JDBC

From: Rene Pijlman <rene(at)lab(dot)applinet(dot)nl>
To: pgsql-jdbc(at)postgresql(dot)org
Cc: Dave Cramer <Dave(at)micro-automation(dot)net>, Barry Lind <barry(at)xythos(dot)com>
Subject: Re: [HACKERS] Troubles using German Umlauts with JDBC
Date: 2001-09-09 08:51:36
Message-ID: m1bmptchn6rme7g7mje6fmot7r7i77j9gl@4ax.com
Lists: pgsql-hackers pgsql-jdbc

I've added a new section "Character encoding" to
http://lab.applinet.nl/postgresql-jdbc/, based on the
information from Dave and Barry.

I haven't seen a confirmation from pgsql-hackers or Bruce yet
that this issue will be added to the Todo list. I'm under the
impression that the backend developers don't see this as a
problem.

Regards,
René Pijlman

On Tue, 04 Sep 2001 10:40:36 -0700, Barry Lind wrote:
>I would like to add one additional comment. In current sources the jdbc
>driver detects (through a hack) that the server doesn't have multibyte
>enabled and then ignores the SQL_ASCII return value and defaults to the
>JVM's character set instead of using SQL_ASCII.
>
>The problem boils down to the fact that without multibyte enabled, the
>server has no way of specifying which 8-bit character set is being
>used for a particular database. Thus a client like JDBC doesn't know
>what character set to use when converting to UNICODE. Thus the best we
>can do in JDBC is use our best guess (JVM character set is probably the
>best default), and allow the user to explicitly specify something else
>if necessary.
>
>thanks,
>--Barry
>
>Rene Pijlman wrote:
>> [forwarding to pgsql-hackers and Bruce as Todo list maintainer,
>> see comment below]
>>
>> [insert with JDBC converts Latin-1 umlaut to ?]
>> On 04 Sep 2001 09:54:27 -0400, Dave Cramer wrote:
>>
>>>You have to set the encoding when you make the connection.
>>>
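>>>import java.sql.Connection;
>>>import java.sql.DriverManager;
>>>import java.util.Properties;
>>>
>>>// user, password, url and encoding are Strings supplied by the application
>>>Properties props = new Properties();
>>>props.setProperty("user", user);
>>>props.setProperty("password", password);
>>>props.setProperty("charSet", encoding); // overrides the encoding reported by the server
>>>Connection con = DriverManager.getConnection(url, props);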
>>>where encoding is the proper encoding for your database
>>>
>>
>> For completeness, I quote the answer Barry Lind gave yesterday.
>>
>> "[the driver] asks the server what character set is being used
>> for the database. Unfortunately the server only knows about
>> character sets if multibyte support is compiled in. If the
>> server is compiled without multibyte, then it always reports to
>> the client that the character set is SQL_ASCII (where SQL_ASCII
>> is 7-bit ASCII). Thus if you don't have multibyte enabled on the
>> server you can't support 8-bit characters through the JDBC
>> driver, unless you specifically tell the connection what
>> character set to use (i.e. override the default obtained from
>> the server)."
>>
>> This really is confusing and I think PostgreSQL should be able
>> to support single byte encoding conversions without enabling
>> multi-byte.
>>
>> At the very least there should be a --enable-encoding-conversion
>> or something similar, even if it just enables the current
>> multibyte support.
>>
>> Bruce, can this be put on the TODO list one way or the other?
>> This problem has appeared 4 times in two months or so on the
>> JDBC list.
>>
>> Regards,
>> René Pijlman <rene(at)lab(dot)applinet(dot)nl>
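
The behaviour discussed in this thread can be illustrated with a small, self-contained sketch. It is not the driver's actual code. The class CharsetFallbackSketch and its methods chooseCharset and roundTrip are hypothetical names invented for this illustration, and the sketch assumes that an explicit charSet setting takes priority and that any other server-reported name is one Java recognises (a real driver would have to map PostgreSQL encoding names to Java ones). It shows why an umlaut turns into "?" when text is encoded as 7-bit ASCII, and how falling back to the JVM default or to a user-supplied character set avoids that.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the PostgreSQL JDBC driver.
class CharsetFallbackSketch {

    /**
     * Picks the charset used to convert between database bytes and Java strings.
     * serverEncoding is what the server reports; userCharSet is the optional
     * "charSet" connection property (null or empty when not set).
     */
    static Charset chooseCharset(String serverEncoding, String userCharSet) {
        // Assumption: an explicit user setting always wins.
        if (userCharSet != null && !userCharSet.isEmpty()) {
            return Charset.forName(userCharSet);
        }
        // A server built without multibyte always reports SQL_ASCII, so the
        // value says nothing about the real 8-bit encoding. Fall back to the
        // JVM default as a best guess.
        if (serverEncoding == null || "SQL_ASCII".equals(serverEncoding)) {
            return Charset.defaultCharset();
        }
        // Assumption: the reported name is one Java understands.
        return Charset.forName(serverEncoding);
    }

    /** Encodes text to bytes and decodes it again, as it would travel to the server and back. */
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String name = "M\u00fcller"; // "Mueller" spelled with a u-umlaut

        // 7-bit ASCII cannot represent the umlaut; getBytes substitutes '?'.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));

        // Latin-1 can represent it, so the text survives unchanged.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1));

        // Explicit override, as with props.put("charSet", ...).
        System.out.println(chooseCharset("SQL_ASCII", "ISO-8859-1"));

        // No override: best guess is the JVM default charset.
        System.out.println(chooseCharset("SQL_ASCII", null));
    }
}
```

Whether the JVM default is the right guess depends on how the data was written to the database in the first place, which is why an explicit override remains necessary in some setups.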
