Re: [BUG] - Invalid UNICODE character sequence found

From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Antonio Gallardo <antonio(at)apache(dot)org>
Cc: Postgres JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: [BUG] - Invalid UNICODE character sequence found
Date: 2004-01-09 11:05:23
Message-ID: 1073646322.15079.57.camel@coppola.ecircle.de
Lists: pgsql-jdbc

Antonio,

As Kris Jurka said in his posts, there is nothing special about the "z"
characters, so your browser/OS must be doing something wrong with the
input. But the error you reported is a clear indication that the backend
received a byte sequence which is not properly encoded as UTF-8. If you
pass only strings to the driver, then this is a driver error (meaning the
driver encodes the string improperly).
To facilitate reproduction, I suggest you print out the Unicode
characters of your query string with something like:

... embed this in your program:

    // Print each UTF-16 code unit of the query as a \uXXXX escape.
    for (int i = 0; i < queryString.length(); i++) {
        System.out.print("\\u");
        System.out.print(toHexString(queryString.charAt(i)));
    }
    System.out.println();
...
    private static final char[] hexChars =
        { '0', '1', '2', '3', '4', '5', '6',
          '7', '8', '9', 'A', 'B', 'C', 'D',
          'E', 'F' };

    // Format the low 16 bits of n as exactly four upper-case hex digits.
    public static String toHexString(int n)
    {
        char[] buffer = new char[4];
        for (int i = 0; i < 4; i++) {
            // Fill from the right: lowest nibble goes into the last slot.
            buffer[3 - i] = hexChars[n & 0x0F];
            n >>= 4;
        }
        return new String(buffer);
    }

Then you can use the resulting string in the example program. This will
make sure that the person on the other end of the email has exactly the
same string as you - otherwise you can bet that subtle encoding
differences will get lost as you type.
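
If it helps, here is a rough, self-contained sketch of the same idea that
also prints the UTF-8 bytes of the string and checks whether a byte
sequence is well-formed UTF-8. The class name UnicodeDump, the helper
names, and the sample query are all made up for illustration, and it
assumes a Java version that has java.nio.charset.StandardCharsets (newer
than the JDKs the current drivers target). The byte pair 0xC0 0x00 is only
my reading of the "(0xc000)" in the error message, not something taken from
the backend source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class UnicodeDump {

    // Render every UTF-16 code unit of s as a \\uXXXX escape,
    // so the string can be pasted into Java source unambiguously.
    static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Render the UTF-8 encoding of s as space-separated hex bytes.
    static String toUtf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Strict check: true only if the bytes decode as UTF-8
    // without any malformed-input error.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical query; substitute the real string from the application.
        String queryString = "SELECT 'z'";

        System.out.println(toEscapes(queryString));
        System.out.println(toUtf8Hex(queryString));

        // A plain "z" is the single byte 7A in UTF-8.
        System.out.println(toUtf8Hex("z"));

        // Assumed reading of the "(0xc000)" in the error message.
        byte[] suspect = { (byte) 0xC0, 0x00 };
        System.out.println(isValidUtf8(suspect));
    }
}
```

If the escapes printed for your real query contain anything other than the
characters you expect, the problem is already present before the string
reaches the driver.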

Cheers,
Csaba.

On Fri, 2004-01-09 at 06:18, Antonio Gallardo wrote:
> Hi:
>
> First, here is the PostgreSQL version used:
> PostgreSQL 7.3.4-RH on i386-redhat-linux-gnu, compiled by GCC
> i386-redhat-linux-gcc (GCC) 3.3.2 20031022 (Red Hat Linux 3.3.2-1)
>
> I am aware that similar problems have already been reported on this list
> (I read some of the reports), but I want to contribute some more
> interesting details:
>
> In the web application under test, we use the PostgreSQL JDBC driver. We
> have a one-field form where the user can enter a search pattern for a
> table. The generated SQL uses LIKE to find similar values. Example:
>
> If the user writes "ant", then the result will be:
>
> antonio
> antoine
> etc.
>
> This works fine, even if we leave the form field empty to show all the
> records.
>
> The interesting stuff I found is:
>
> If we write just "z", "Z", or any string that includes the characters "z"
> or "Z" anywhere in the field, then I get the error below. How is this
> possible? I am not a UTF-8, ISO-8859-1 or SQL_ASCII expert, but as far as
> I know "z" and "Z" are part of ASCII, which means a 1-byte code in UTF-8.
>
> Does that mean the driver has problems with a normal "z" or "Z"?
>
> Note: The same applies to the drivers:
>
> pg73jdbc.jar
> pg74jdbc.jar
> pg74.1jdbc.jar
>
> Please explain.
>
> Best Regards,
>
> Antonio Gallardo
>
> Caused by: java.sql.SQLException: ERROR: Invalid UNICODE character
> sequence found (0xc000)
>
> at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:131)
> at org.postgresql.jdbc1.AbstractJdbc1Connection.ExecSQL(AbstractJdbc1Connection.java:505)
> at org.postgresql.jdbc1.AbstractJdbc1Statement.execute(AbstractJdbc1Statement.java:320)
> at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:48)
> at org.postgresql.jdbc1.AbstractJdbc1Statement.executeQuery(AbstractJdbc1Statement.java:153)
