Re: invalid byte sequence for encoding "UTF8": 0xf1612220

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Cc: AI Rumman <rummandba(at)gmail(dot)com>, pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: invalid byte sequence for encoding "UTF8": 0xf1612220
Date: 2011-05-12 08:32:50
Message-ID: BANLkTind_ROno+bFjNYZ-KLbB9vTQ65Jgw@mail.gmail.com
Lists: pgsql-general

2011/5/12 Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>:
> On 05/11/2011 03:16 PM, AI Rumman wrote:
>>
>> I am trying to migrate a database from Postgresql 8.2 to Postgresql 8.3
>> and getting the following error:
>>
>> pg_restore: [archiver (db)] Error from TOC entry 2764; 0 29708702 TABLE
>> DATA originaldata postgres
>> pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence
>> for encoding "UTF8": 0xf1612220
>> HINT:  This error can also happen if the byte sequence does not match
>> the encoding expected by the server, which is controlled by
>> "client_encoding".
>> CONTEXT:  COPY wi_originaldata, line 3592
>>
>> I took a dump from 8.2 server and then tried to restore at 8.3.
>>
>> Both the client_encoding and server_encoding are UTF8 at both the servers.
>
> Newer versions of Pg got better at catching bad Unicode. While this helps
> prevent bad data getting into the database, it's a right pain if you're
> moving data over from an older version with less strict checks.
>
> I don't know of any way to relax the checks for the purpose of importing
> dumps. You'll need to fix your dump files before loading them (by finding
> the faulty text and fixing it) or fix it in the origin database before
> migrating the data. Neither approach is nice or easy, but nobody has yet
> stepped up to write a unicode verifier tool that checks old databases' text
> fields against stricter rules...

The following two articles have SQL functions and documentation you may
find useful:

http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_1.html
http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html
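
If you go the "fix the dump file" route, a small script can at least locate
the offending lines. What follows is only a rough sketch, not a tested
migration tool: it assumes you have a plain-text SQL dump (for a custom-format
archive, something like "pg_restore -f dump.sql yourarchive" should produce
one), and the function name find_invalid_utf8 is made up for illustration.
Python's UTF-8 decoder is strict, but it is not guaranteed to accept and
reject exactly the same sequences as the 8.3 server, so treat the result as
a list of candidates.

```python
import sys


def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset, raw_line) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, e.g. a file opened in binary mode.
    Hypothetical helper, written for this example only.
    """
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte in the line
            yield lineno, exc.start, raw


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_bad_utf8.py dump.sql   (file name is an assumption)
    with open(sys.argv[1], "rb") as dump:
        for lineno, offset, raw in find_invalid_utf8(dump):
            # Show the line decoded as LATIN1 purely as a guess at what was meant
            guess = raw.decode("latin-1").rstrip("\n")
            print(f"{lineno}: byte {offset}: {guess}")
```

Note that the line numbers printed here count from the start of the dump
file, while "COPY wi_originaldata, line 3592" counts from the start of that
table's COPY data. In the reported sequence 0xf1612220, the byte 0xf1 starts
a four-byte UTF-8 character and must be followed by continuation bytes, but
0x61 is a plain "a". In LATIN1, 0xf1 is "ñ", so the row may simply contain
LATIN1 text that was stored unconverted; that is a guess, so check the data
before converting anything.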

>
> --
> Craig Ringer
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
