Skip site navigation (1) Skip section navigation (2)

Re: [JDBC] Invalid Character Data Problem

From: Kris Jurka <books(at)ejurka(dot)com>
To: Hunter Hillegas <lists(at)lastonepicked(dot)com>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>,Postgre JDBC List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: [JDBC] Invalid Character Data Problem
Date: 2004-11-30 06:34:46
Message-ID: Pine.BSO.4.56.0411300128400.10837@leary.csoft.net (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-jdbc

On Fri, 26 Nov 2004, Hunter Hillegas wrote:

> When I SELECT from a certain table, I see this JDBC exception:
> 
> "Invalid character data was found.  This is most likely caused by stored
> data containing characters that are invalid for the character set the
> database was created in.  The most common example of this is storing 8bit
> data in a SQL_ASCII database."
> 
> The database is indeed of type SQL_ASCII. The table stores mailing list data
> and has about 400,000 rows.
> 
> Looking at the data via psql, I see that some of the rows have strange
> characters in them, such as question marks where I would not expect them,
> etc...
> 
> What are my options? Is there a way to identify the 'bad' records, or the
> ones causing trouble?
> 

To really solve this problem you need to have a correctly encoded 
database.  This will involve a dump and restore process and possibly 
recoding your data.  This is straightforward if you know what 
encoding your data is, although it will cause some downtime.

To detect the bad data you can try various SELECTs with the JDBC driver 
and see what errors out.  The function below will determine if a 
particular field has data with the high bit set which is something the 
database really doesn't know what to do with.

SELECT pkcolumn, hashighbit(columna), hashighbit(columnb) FROM mytable;

Kris Jurka

CREATE OR REPLACE FUNCTION hashighbit(text) RETURNS boolean AS '
DECLARE
        i int;
BEGIN
        i := LENGTH($1);
        WHILE i > 0 LOOP
                IF ascii(substring($1, i, 1)) >= 128 THEN
                        RETURN true;
                END IF;
                i := i-1;
        END LOOP;
        RETURN false;
END;
' LANGUAGE 'plpgsql';


In response to

Responses

pgsql-jdbc by date

Next:From: Hunter HillegasDate: 2004-11-30 06:49:23
Subject: Re: [JDBC] Invalid Character Data Problem
Previous:From: Kris JurkaDate: 2004-11-30 06:18:05
Subject: Re: Translation update: pt_BR

pgsql-general by date

Next:From: Hunter HillegasDate: 2004-11-30 06:49:23
Subject: Re: [JDBC] Invalid Character Data Problem
Previous:From: Michael FuhrDate: 2004-11-30 06:13:10
Subject: Re: increase the column size

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group