Trouble with UTF-8 data

From: Janine Sisk <janine(at)furfly(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Trouble with UTF-8 data
Date: 2008-01-17 23:02:22
Message-ID: 4B0F3EA7-4AFF-4CDC-8BFB-9D0387553B62@furfly.net
Lists: pgsql-general

Hi all,

I'm moving a database from PG 7.2.4 to 8.2.6. I have already run
iconv on the dump file like so:

iconv -c -f UTF-8 -t UTF-8 -o out.dmp in.dmp

But I'm still getting this error when loading the data into the new
database:

ERROR: invalid byte sequence for encoding "UTF8": 0xeda7a1
HINT: This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".
CONTEXT: COPY article, line 2
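
As a side note on the reported bytes: 0xeda7a1 has the shape of a three-byte UTF-8 sequence, so the code point it would stand for can be worked out by hand. The sketch below is only an illustration (the helper name decode_3byte_utf8 is made up for this example). It shows that the bytes map into the surrogate range U+D800..U+DFFF, which strict UTF-8 does not allow. Whether that is what is actually wrong with this particular dump is an assumption, not something the error message confirms.

```python
def decode_3byte_utf8(seq: bytes) -> int:
    """Return the code point a 3-byte UTF-8-shaped sequence would encode.

    No validity checks are done here; this is only the bit arithmetic.
    """
    b0, b1, b2 = seq
    # 1110xxxx 10yyyyyy 10zzzzzz -> xxxx yyyyyy zzzzzz
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


reported = bytes.fromhex("eda7a1")  # bytes quoted in the ERROR line
cp = decode_3byte_utf8(reported)
is_surrogate = 0xD800 <= cp <= 0xDFFF

print(f"U+{cp:04X} surrogate={is_surrogate}")

# A strict UTF-8 decoder refuses the same bytes.
try:
    reported.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print("strict decode ok:", strict_ok)
```

If the value does fall in the surrogate range, a converter that is lenient about surrogates could pass it through unchanged while a stricter server-side check rejects it.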

FWIW this is the second database I've moved this way and for the
first one, iconv fixed all the byte sequence errors. No such luck
this time.

The 7.2.4 database has encoding UNICODE, and the 8.2.6 one is in UTF-8.

To make matters even more fun, the data is in Traditional Chinese
characters, which I don't read, so there seems to be no way for me to
identify the problem bits. I've loaded the dump file into a hex
editor and searched for the value that's reported as the problem but
it's not in the file.
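
One way to locate the problem bytes without reading Chinese is to let a strict decoder point at them. The following is a minimal sketch, assuming Python 3 is available and that its strict UTF-8 decoder is at least as picky as the server's check; the function name find_invalid_utf8 and the command-line handling are invented for this example. It reads the dump in binary mode and reports the file line number, the byte offset within that line, and the offending bytes.

```python
import sys


def find_invalid_utf8(lines):
    """Yield (line_number, offset_in_line, bad_bytes) for each undecodable span.

    `lines` is any iterable of bytes objects, e.g. a file opened with "rb".
    """
    for lineno, line in enumerate(lines, start=1):
        pos = 0
        while pos < len(line):
            try:
                line[pos:].decode("utf-8")  # strict by default
                break                        # rest of the line is clean
            except UnicodeDecodeError as err:
                start = pos + err.start
                end = pos + err.end
                yield lineno, start, line[start:end]
                pos = end                    # resume after the bad span


# Small self-check: two valid CJK characters followed by the reported bytes.
sample = [b"ok line\n", "中文".encode("utf-8") + b"\xed\xa7\xa1\n"]
hits = list(find_invalid_utf8(sample))

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_bad_utf8.py out.dmp
    with open(sys.argv[1], "rb") as dump:
        for lineno, offset, bad in find_invalid_utf8(dump):
            print(f"line {lineno}, byte {offset}: {bad.hex()}")
```

The line numbers printed are lines of the dump file, whereas the "line 2" in the CONTEXT message presumably counts from the start of the COPY data for that table, so the two will not match directly. A single bad character may also be reported as several one-byte spans, depending on where the decoder gives up.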

Is there anything I can do to fix this?

Thanks in advance,

janine
