invalid byte sequence for encoding "UNICODE": 0xd9

From: Eric Walstad <eric(at)ericwalstad(dot)com>
To: SF Postgres <sfpug(at)postgresql(dot)org>
Subject: invalid byte sequence for encoding "UNICODE": 0xd9
Date: 2006-02-13 22:37:45
Message-ID: 200602131437.46440.eric@ericwalstad.com
Lists: sfpug

Hi everyone,

Question: How do I keep from receiving the subject error message when
loading data?

I'm working on a project that has a table with about 2.7M records,
mostly address data, in it. I wanted to copy some of this data to a
different machine (call it 'B') for testing purposes. When I try to
load the data copied from machine 'A' into B, I receive the subject
error message. I've searched Google and the PostgreSQL docs but didn't
find anything obviously helpful. Any pointers to relevant docs and/or
tips are greatly appreciated.

Here's the supporting info (with some stuff changed to protect the
innocent):
Machine 'A', the source machine, is running:
psql --version
psql (PostgreSQL) 7.3.4-RH

Machine 'B', the destination machine, is running:
psql --version
psql (PostgreSQL) 8.0.6

command I used to create the sql file:
pg_dump -U user_name dbname --table=table_name > output.sql

command I used to load the data into B:
psql -q dbname < output.sql

Error message I receive:
ERROR: invalid byte sequence for encoding "UNICODE": 0xd9
CONTEXT: COPY table_name, line 330517, column column_name: "2000 FOO
ST.�
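
To illustrate what the server is objecting to, here is a minimal Python sketch (Python is not part of the setup described here; the sample bytes and the helper name `decode_report` are made up for illustration, modeled on the CONTEXT line above). In UTF-8, 0xd9 is the lead byte of a two-byte sequence and must be followed by a continuation byte in the range 0x80-0xbf, so a 0xd9 followed by a newline cannot be decoded.

```python
# Hypothetical sample modeled on the error context: the address text
# followed by a single 0xd9 byte and a newline.
sample = b"2000 FOO ST.\xd9\n"


def decode_report(data: bytes) -> str:
    """Report whether data is valid UTF-8 and, if not, where it first fails."""
    try:
        data.decode("utf-8")
        return "valid UTF-8"
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        return f"invalid UTF-8 at offset {exc.start}: 0x{data[exc.start]:02x}"


print(decode_report(sample))
# The same byte is a legal single character in Latin-1 (ISO-8859-1).
print(repr(b"\xd9".decode("latin-1")))
```

The second print shows that the byte is acceptable as Latin-1, which may or may not be what the source data actually is.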

Related comments:
I don't have experience with Unicode. I'm not sure what character
0xd9 is ('Latin capital U with grave'?). The data in question is for
addresses in California, so I don't expect it to have foreign
language characters in it. I've tried deleting the trouble character
by hand, with vim, but after trying this approach 5 times, I decided
I'd better find out what the real problem is.
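
Rather than removing offending characters one at a time, the dump file could be scanned for every line that is not valid UTF-8. The following is a rough sketch under assumptions, not a tested fix: the function name `find_invalid_utf8_lines` is hypothetical, and it only reports problems without changing the file.

```python
from typing import Iterator, Tuple


def find_invalid_utf8_lines(path: str) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line_number, byte_offset, raw_line) for each line that is not valid UTF-8."""
    # Read in binary mode so that no decoding happens implicitly.
    with open(path, "rb") as fh:
        for lineno, raw in enumerate(fh, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the first undecodable byte in the line.
                yield lineno, exc.start, raw


# Example use (output.sql is the dump file named earlier in this message):
# for lineno, offset, raw in find_invalid_utf8_lines("output.sql"):
#     print(lineno, offset, raw[:60])
```

Note that the line numbers reported here are positions in the file, so they will not match the COPY line number in the error message, which counts from the start of the COPY data.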

Loading the data into another machine running psql (PostgreSQL) 7.4.7
works as expected.

I used 'tar cjf' to compress the data file before transferring it from
A to B:
On Machine A
tar --version
tar (GNU tar) 1.13.25

On Machine B
tar --version
tar (GNU tar) 1.15.1

Both machines are running the same version of bzip2: Version 1.0.2

Machine A is:
RedHat Enterprise Linux, 2.4.21-9.0.1.EL #1 Mon Feb 9 22:26:52 EST
2004 i686 athlon i386 GNU/Linux

Machine B is:
Fedora Core 4, 2.6.15-1.1830_FC4 #1 Thu Feb 2 17:23:41 EST 2006 i686
i686 i386 GNU/Linux

Both machines have postgresql that was installed from RPM binaries.

Thanks in advance,

Eric.
