Re: getTables() doesn't handle umlauts correctly

From: Thomas Kellerer <spam_eater(at)gmx(dot)net>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getTables() doesn't handle umlauts correctly
Date: 2010-11-23 14:22:10
Message-ID: icgimh$46u$1@dough.gmane.org
Lists: pgsql-jdbc
Kris Jurka, 23.11.2010 09:13:
> As the discussion has shown, trying to determine who is at fault here
> is not trivial. The best way to show that postgresql (driver or
> server if you're seeing it in pgadmin too) is at fault is to create a
> test case creating the table and then querying the metadata. It would
> be helpful to use either a Java or PG escape code for the special
> character so it doesn't get mangled by either mail clients or build
> environments. Then use String.codePointAt to print out the actual
> data for both the table name used for construction and returned by
> the metadata. That would conclusively show that PG is at fault
> somewhere.

OK, this is my test program:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class UmlautTest {

   // Prints every Unicode code point of the given string as a decimal number.
   static void printCodePoints(String s) {
      System.out.print("  codepoints:");
      for (int i = 0; i < s.length();) {
         int cp = s.codePointAt(i);
         System.out.print(" " + cp);
         i += Character.charCount(cp);
      }
      System.out.println("");
   }

   public static void main(String[] args) throws Exception {
      Connection con = DriverManager.getConnection("jdbc:postgresql://localhost:5432/postgres", "postgres", "postgres");
      Statement stmt = con.createStatement();

      // Create a table whose name contains an umlaut and insert umlaut data.
      stmt.executeUpdate("create table umlaut_ö (some_data varchar(10))");
      stmt.executeUpdate("insert into umlaut_ö (some_data) values ('öäü')");

      // 1) Table name as returned by the driver's metadata.
      ResultSet rs = con.getMetaData().getTables(null, "public", "umlaut%", null);
      if (rs.next()) {
         String name = rs.getString("TABLE_NAME");
         System.out.println("table name: " + name);
         printCodePoints(name);
      }
      rs.close();

      // 2) Umlauts sent to the server, in the table name and in a literal.
      rs = stmt.executeQuery("select count(*) from umlaut_ö where some_data = 'öäü'");
      if (rs.next()) {
         int count = rs.getInt(1);
         System.out.println("number of rows: " + count);
      }
      rs.close();

      // 3) Column data as returned by the server.
      rs = stmt.executeQuery("select some_data from umlaut_ö");
      if (rs.next()) {
         String data = rs.getString(1);
         System.out.println("data: " + data);
         printCodePoints(data);
      }
      rs.close();

      stmt.executeUpdate("drop table umlaut_ö");

      stmt.close();
      con.close();
   }
}


The output on my computer is:

table name: umlaut_test_�
   codepoints: 117 109 108 97 117 116 95 116 101 115 116 95 65533
number of rows: 1
data: öäü
   codepoints: 246 228 252

So it seems that the umlauts in the table name are returned with a different encoding than the data itself.

Nevertheless, umlauts that are *sent* to the server are always handled correctly, both as part of a table name and as column values.
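
For reference, the code point 65533 in the table name is U+FFFD, the Unicode replacement character, while 246, 228 and 252 are the expected ö, ä and ü. The following is only a minimal sketch of one way such a value can arise in Java (decoding a LATIN1 byte as UTF-8). It is an illustration, not a diagnosis of where the conversion goes wrong here; the class name ReplacementCharDemo and the helper codePoints are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {

    // Returns the code points of s as a space-separated list of decimal numbers.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length();) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(cp);
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // o-umlaut written as a Unicode escape so the source encoding does not matter
        String original = "\u00f6";

        // In ISO-8859-1 (LATIN1) the character is the single byte 0xF6.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);

        // 0xF6 on its own is not valid UTF-8, so the decoder substitutes U+FFFD.
        String misdecoded = new String(latin1, StandardCharsets.UTF_8);

        System.out.println("original:   " + codePoints(original));   // 246
        System.out.println("misdecoded: " + codePoints(misdecoded)); // 65533
    }
}
```

If something like this happens, it would have to affect only the path that returns the table name, since the column data comes back with the correct code points.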

This is with PostgreSQL 9.0.1 on Windows XP, using postgresql-9.0-801.jdbc4.jar.
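
To rule out the source file encoding or the mail client as a factor, as Kris suggested, the statements could also be built with Java Unicode escapes instead of literal umlauts. A rough sketch, assuming the same table layout as in the test program; the class name EscapedNames and its methods are only illustrative:

```java
public class EscapedNames {

    // Table name with the o-umlaut (U+00F6) written as a Java Unicode escape.
    static final String TABLE = "umlaut_\u00f6";

    // Test data: o-umlaut, a-umlaut, u-umlaut (U+00F6, U+00E4, U+00FC).
    static final String DATA = "\u00f6\u00e4\u00fc";

    static String createSql() {
        return "create table " + TABLE + " (some_data varchar(10))";
    }

    static String insertSql() {
        return "insert into " + TABLE + " (some_data) values ('" + DATA + "')";
    }

    public static void main(String[] args) {
        // These strings would be passed to Statement.executeUpdate() in the test.
        System.out.println(createSql());
        System.out.println(insertSql());

        // Code point of the last character of the table name as it is sent.
        System.out.println(TABLE.codePointAt(TABLE.length() - 1)); // 246
    }
}
```

Comparing the code point printed here with the one returned by getTables() shows directly whether the name changed on its way back from the server.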

Regards
Thomas

