Re: Can't restore a pg_dump due to encoding errors

From: adey <adey11(at)gmail(dot)com>
To: "Dan Harris" <fbsd(at)drivefaster(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Can't restore a pg_dump due to encoding errors
Date: 2007-04-07 21:29:39
Message-ID: 1c66bda80704071429p68409036oa2f384a61d8d5811@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

We had a similar conversion conflict some time back whe upgrading and found
using a search for anything not A-Z, 0-9, etc, was useful to find offensive
data and fix it.

On 4/8/07, Dan Harris <fbsd(at)drivefaster(dot)net> wrote:
>
> Hello all:
>
> I'm in a bit of a bind here. Today is my maintenance window for upgrading
> from
> 8.0.3 to 8.2.3. I did a pg_dumpall overnight ( resulting in a 72GB file )
> and
> then a pg_dump on one of my small databases this morning to test
> pg_restore.
> When trying to reload, I'm getting the following error:
>
> psql:sac.bak:2793776: ERROR: invalid byte sequence for encoding "UTF8":
> 0x92
>
> A google search came up with this link:
> http://www.hostingforum.ca/280472-invalid-byte-sequence-encoding-utf8.html
>
> Which sounds very simliar to what's going on here.
>
> Again the dump was done on 8.0.3, which sounds like it was before the
> anti-sql
> injection patch that would refuse to import "invalid" characters. The
> database
> ( for now ) *should* only be storing English/Latin characters. Although,
> the
> old server had LOCALE='C' previously ( this was unnecessary so I'm trying
> to
> drop that today ). I don't know if this would matter, but that's the only
> thing
> that was different from a bog standard Postgres setup besides the obvious
> tuning
> changes. I have never changed any of the encoding settings on client or
> server
> from the defaults.
>
> So far, I tried editing the dump file and changing the client_encoding to
> 'UTF8'
> at the top, hoping this might make a difference but did not.
>
> The advice in that thread seems to be "fix the bad data". But I am not
> seeing
> how I can tell exactly what or where this bad data is. Luckily, this
> small dump
> file is manageable by opening in vim and pointing to the line 2793776 the
> error
> references, however the line referenced is just the "\." at the end of a
> huge
> COPY section. I don't know which line inside that secion is the problem
> and
> there's enough of them that it'd be pretty impractical for me to visually
> scan
> the file looking for something ( I'm not even quite sure what to look for
> at
> this point ). And, if that is the only solution and I run into a problem
> restoring the 72GB dump file, I'm going to be in some serious trouble,
> especially if I am not alerted to the problem until *after* it's finished
> looking at all the COPY statements!
>
> What say the sages?
>
> Thanks
>
> -Dan
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Dan Harris 2007-04-07 21:32:17 Re: Can't restore a pg_dump due to encoding errors
Previous Message Dan Harris 2007-04-07 21:09:25 Can't restore a pg_dump due to encoding errors