Re: Copying into Unicode - Correcting Errors

From: Hunter Hillegas <lists(at)lastonepicked(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>, Postgre JDBC List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Copying into Unicode - Correcting Errors
Date: 2004-11-24 16:25:39
Message-ID: BDC9F603.4DBB3%lists@lastonepicked.com
Lists: pgsql-general pgsql-jdbc

Peter,

Thanks for the reply.

Perhaps I should go into some more detail about what is going on.

Originally, the database was in SQL_ASCII and the data had been imported via
COPY from a text file. The text file is no longer available. The data went
into the table just fine.

When selecting from the table via JDBC, I see this exception:

'Invalid character data was found. This is most likely caused by stored
data containing characters that are invalid for the character set the
database was created in. The most common example of this is storing 8bit
data in a SQL_ASCII database.'

OK, I had never seen this before, but I did a little investigating, and some
of what I found online suggests that I should change the database encoding.

When I try UNICODE, I get the error below during my data import.

The 'bad' data looks like this when I SELECT:

| Ver?onica |

Is it possible that this is an issue with beta5 in conjunction with the JDBC
driver and encoding? I didn't see a CHANGELOG note that would make me
suspicious, but I'm not sure I would know it if I saw it.

Hunter

> From: Peter Eisentraut <peter_e(at)gmx(dot)net>
> Date: Wed, 24 Nov 2004 11:19:44 +0100
> To: Hunter Hillegas <lists(at)lastonepicked(dot)com>
> Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>
> Subject: Re: [GENERAL] Copying into Unicode - Correcting Errors
>
> Hunter Hillegas wrote:
>> I need to import a file into a Unicode database.
>>
>> I am getting an error:
>>
>> ERROR: Unicode characters greater than or equal to 0x10000 are not
>> supported
>> CONTEXT: COPY mailing_list_entry, line 30928, column
>> first_last_name: "Ver?nica"
>
> If your file really does have Unicode characters greater than or equal
> to 0x10000, then I don't have a good answer.
>
> But more often, this error means that your file is not in Unicode in the
> first place. If so, set the client encoding to the real encoding of
> your file, e.g.
>
> export PGCLIENTENCODING=LATIN1
>
> --
> Peter Eisentraut
> http://developer.postgresql.org/~petere/
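
To illustrate the point about the file not being Unicode in the first place,
here is a minimal Python sketch. It assumes (this is a guess, not something
confirmed in the thread) that the name is "Verónica" and that it was stored
as LATIN1. Under that assumption the accented letter is the single byte 0xF3,
which a UTF-8 decoder reads as the start of a 4-byte sequence, i.e. a
character at or above 0x10000. That may be why the server reports that error
rather than a plain "invalid byte" message. The helper name is_valid_utf8 is
hypothetical and only used for this example.

```python
# Assumption: the stored name is "Verónica" encoded as LATIN1 (ISO 8859-1).
raw = "Ver\u00f3nica".encode("latin-1")

# The fourth byte is the LATIN1 encoding of the accented "o".
lead = raw[3]

# In UTF-8, a byte of the form 11110xxx opens a 4-byte sequence, which is
# only used for code points greater than or equal to 0x10000.
is_four_byte_lead = (lead & 0xF8) == 0xF0


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Decoding with the real encoding and re-encoding gives valid UTF-8. This is
# roughly the conversion the server can do for you once the client encoding
# is declared as LATIN1.
fixed = raw.decode("latin-1").encode("utf-8")

print(hex(lead), is_four_byte_lead, is_valid_utf8(raw), is_valid_utf8(fixed))
```

If the data really is LATIN1, declaring that as the client encoding (as
suggested above) lets the conversion happen during the import, so no manual
re-encoding of the data should be needed.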
