Re: Character Decoding Problems

From: Evan Tsue <evan(at)windsormgmt(dot)com>
To: Barry Lind <blind(at)xythos(dot)com>
Subject: Re: Character Decoding Problems
Date: 2003-08-12 21:50:36
Message-ID: 17C3FC67-CD27-11D7-A787-000A95A08104@windsormgmt.com
Lists: pgsql-jdbc

Barry,

Sure. I'm not certain what the best way to do this is, but if this is
not sufficient, then we can try something else. Here's the schema for
the table I created:

testdb=# \d messages
                                      Table "public.messages"
    Column    |          Type          |                             Modifiers
--------------+------------------------+-------------------------------------------------------------------
 message_uid  | integer                | not null default nextval('public.messages_message_uid_seq'::text)
 message_text | character varying(255) |
Indexes: messages_pkey primary key btree (message_uid)

I used this command to create that table:

CREATE TABLE messages (message_uid SERIAL PRIMARY KEY, message_text
VARCHAR(255));

The next thing I did from psql was this insert statement:

INSERT INTO messages (message_text) VALUES ('يرجى ادخال النص المراد ترجمته');

I hope that the Arabic text I have in there comes out right for you.
If not, let me know.

So, if I do a SELECT * FROM messages; in psql, everything comes out
fine. Now, here's the Java code that I used to access this data:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Properties;

public class LanguageTest {

    public static void main(String[] args) {
        Connection conn = null;
        PreparedStatement ps = null;
        ResultSet rs = null;

        // Load the PostgreSQL driver.
        try {
            Class.forName("org.postgresql.Driver");
        } catch (ClassNotFoundException e) {
            System.err.println("Unable to find the PostgreSQL JDBC driver.");
            System.exit(1);
        }

        try {
            // Set the connection properties.
            Properties info = new Properties();
            info.put("user", "test");
            info.put("password", "test");

            // Create a new connection.
            conn = DriverManager.getConnection(
                "jdbc:postgresql://127.0.0.1:5432/testdb", info);

            // Prepare the SQL statement.
            ps = conn.prepareStatement("SELECT * FROM messages");

            // Execute the query.
            rs = ps.executeQuery();

            // Iterate through the results. next() also works on the default
            // forward-only ResultSet, where first() is not guaranteed to.
            while (rs.next()) {
                int messageId = rs.getInt(1);
                String message = rs.getString(2);
                System.out.println(
                    "UID: " + messageId + " Message: " + message);
            }
        } catch (SQLException ex) {
            ex.printStackTrace();
        } finally {
            // Close the resources, skipping any that were never opened.
            try {
                if (rs != null) rs.close();
                if (ps != null) ps.close();
                if (conn != null) conn.close();
            } catch (SQLException e1) {
                // Nothing useful can be done if closing fails.
            }
        }
    }
}
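
One way to tell a decoding fault apart from a console display issue is to look at code points instead of printed glyphs. The following is only a minimal sketch and not part of the test case above: the class name Utf8DecodeCheck and both helper methods are hypothetical. It decodes raw UTF-8 bytes with the JDK's own String constructor (assuming a current JDK) and prints each character as U+XXXX, so the output does not depend on the terminal's encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8DecodeCheck {

    // Decode raw UTF-8 bytes with the JDK's built-in decoder.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Render every code point as U+XXXX, separated by single spaces.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-8 bytes for the Arabic letters YEH (U+064A) and REH (U+0631),
        // the first two letters of the message inserted above.
        byte[] raw = {(byte) 0xD9, (byte) 0x8A, (byte) 0xD8, (byte) 0xB1};
        System.out.println(toCodePoints(decode(raw)));
    }
}
```

Passing the value returned by rs.getString(2) to toCodePoints in the same way would show whether the driver hands back the expected Arabic code points or literal question marks (U+003F).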

Let me know what you think.

Evan

On Tuesday, Aug 12, 2003, at 17:38 US/Eastern, Barry Lind wrote:

> Evan,
>
> Can you provide a test case to demonstrate your problem? Many people
> are using the driver successfully with non-English characters, so I
> don't think the problem is as you describe it.
>
> thanks,
> --Barry
>
> Evan Tsue wrote:
>> Hi,
>>
>> I've been having problems decoding non-Latin characters using the
>> Postgres JDBC driver. Here's the situation: I'm using postgres
>> 7.3.2 and I've created a test database using 'createdb -E UNICODE
>> testdb' to ensure that I really am using the UNICODE character set.
>> Using psql, I created a table using the following command: 'CREATE
>> TABLE messages (message_uid SERIAL PRIMARY KEY, message_text
>> VARCHAR(255))' to test character encoding and decoding. At that
>> point, I inserted a message that was in English. I also inserted a
>> message that was in Arabic. I did a select on that table using psql
>> and the values came back perfectly (I'm using MacOS X, so the
>> characters are displayed correctly).
>>
>> Next, I did a select on the same table via JDBC. All I had the
>> program do was select on the table and print the results out to
>> standard output. The message in English was displayed perfectly.
>> However, the message that was in Arabic was displayed as a series of
>> question marks and spaces.
>>
>> I eventually navigated my way through the JDBC driver source to
>> find that the problem is in the decodeUTF8 method in the
>> org.postgresql.core.Encoding class. Apparently, it doesn't seem to
>> be working properly for non-Western characters. I replaced the call
>> to that method with a call to the java.lang.String constructor and
>> now everything works perfectly.
>>
>> In addition to Arabic, I took a random sample of Chinese,
>> Japanese, Russian and Korean text and inserted it into the database.
>> Using the original driver, I get the question marks. But, when I
>> used the String constructor, everything comes out fine.
>>
>> Could someone please either fix the Encoding.decodeUTF8 method or
>> replace the call to that with a call to the String constructor?
>>
>> Thanks,
>> Evan
>
