Re: Is this error correct/possible?

From: Kris Jurka <books(at)ejurka(dot)com>
To: Joost Kraaijeveld <J(dot)Kraaijeveld(at)Askesis(dot)nl>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Is this error correct/possible?
Date: 2005-08-23 14:34:31
Message-ID: Pine.BSO.4.62.0508230919260.32740@leary.csoft.net
Lists: pgsql-jdbc

On Tue, 23 Aug 2005, Joost Kraaijeveld wrote:

> I have a database which is created in PostgreSQL 8.0.3 which is filled
> with a PostgreSQL 7.4.7 database backup. Both databases were created
> with SQL_ASCII.
>
> "Invalid character data was found. This is most likely caused by stored
> data containing characters that are invalid for the character set the
> database was created in. The most common example of this is storing
> 8bit data in a SQL_ASCII database."
>
> Inspection of the row (using pgadmin3) shows that there is the char "ü"
> in a char(40) column.
>
> Questions:
>
> 1. Is a "ü" allowed in a SQL_ASCII database and a column of char(40)?

It is allowed to be stored in the database because SQL_ASCII is not a real
encoding. SQL_ASCII allows you to store anything you want and doesn't
require you to tell the server what character set it actually is. The
problem is on the return end: the JDBC driver asks the server to always
return data in UTF-8 by setting client_encoding appropriately. The
server has no idea what the original encoding of the data was, so it has
no means of converting it to Unicode. It may happen to look like
u-double-dot in your particular pgadmin3 client's encoding, but if that
client's encoding were different it would show up as a different character.
This is why the JDBC driver bails out instead of just picking a
random character.
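
To illustrate the ambiguity, here is a minimal Java sketch. It assumes the
stored byte is 0xFC, which is what "ü" would be if the data had been written
as ISO-8859-1; that is a guess, since the actual bytes in the column are not
known. The class and method names are made up for this example and are not
part of the JDBC driver.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class AmbiguousByte {
    // Decode strictly: return null on invalid input instead of silently
    // substituting a replacement character.
    static String strictDecode(byte[] data, Charset cs) {
        try {
            return cs.newDecoder()
                     .onMalformedInput(CodingErrorAction.REPORT)
                     .onUnmappableCharacter(CodingErrorAction.REPORT)
                     .decode(ByteBuffer.wrap(data))
                     .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumed content of the column: one byte, 0xFC.
        byte[] stored = { (byte) 0xFC };

        // Read as ISO-8859-1, the byte is U+00FC (u-double-dot).
        String latin1 = strictDecode(stored, StandardCharsets.ISO_8859_1);
        System.out.println(Integer.toHexString(latin1.codePointAt(0))); // fc

        // Read as UTF-8, the same byte is not a valid sequence at all.
        System.out.println(strictDecode(stored, StandardCharsets.UTF_8) == null); // true
    }
}
```

The same byte is a valid character under one charset and garbage under
another, and nothing in the byte itself says which reading was intended.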

> 2. If so, is this a JDBC bug?

No. The JDBC documentation clearly states not to choose a SQL_ASCII
database for your data.

http://jdbc.postgresql.org/documentation/80/your-database.html

> 3. If not, is this a PostgreSQL bug, allowing a non-allowed character in
> a column?
>

This is how the SQL_ASCII encoding works, for better or worse (mostly
worse). The likely problem is that two different clients have connected
with different client_encodings, so the database now holds data in two
different encodings. Each client then breaks when it tries to display
what the other one stored.
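
A rough Java sketch of that situation, with no database involved: it only
models the server as something that keeps whatever bytes each client sends,
which is the SQL_ASCII behaviour described above. The two "clients" and the
class name are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MixedEncodings {
    // Render bytes as space-separated hex so the output does not depend
    // on the console encoding.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00fc"; // u-double-dot

        // Two clients send the same character in different encodings;
        // the bytes are stored unchanged.
        byte[] fromLatin1Client = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] fromUtf8Client = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(fromLatin1Client)); // fc
        System.out.println(hex(fromUtf8Client));   // c3 bc

        // A Latin-1 client reading the UTF-8 bytes sees two characters.
        String misread = new String(fromUtf8Client, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 2

        // A UTF-8 client reading the Latin-1 byte gets a replacement character.
        String broken = new String(fromLatin1Client, StandardCharsets.UTF_8);
        System.out.println(broken.indexOf('\uFFFD') >= 0); // true
    }
}
```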

Kris Jurka
