Re: evil characters #bfef cause dump failure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christian Fowler <spider(at)viovio(dot)com>
Cc: pgsql-admin list <pgsql-admin(at)postgresql(dot)org>
Subject: Re: evil characters #bfef cause dump failure
Date: 2004-11-15 21:00:22
Message-ID: 10656.1100552422@sss.pgh.pa.us
Lists: pgsql-admin

Christian Fowler <spider(at)viovio(dot)com> writes:
> server_encoding
> -----------------
> SQL_ASCII

> whoa! yikes, I bet this has a lot to do with it? I really wanted to keep
> everything UNICODE end-to-end. I must have forgotten --encoding on my
> initdb? Anything I can do at this point?

Hmm ... the safe way would be dump-n-reload but that's not working for
you. What you can try is to alter the pg_database.encoding value for
that database, then start fresh backends (any existing ones won't notice
the change). Worst case if that doesn't make life good is to change it
back.

The real problem is that you've got invalid Unicode data in the database
(I'm not an expert, but I think that #bf is a continuation byte that
cannot stand on its own in UTF-8 and #ef starts a 3-byte sequence, so if
this comes within 2 bytes of end-of-line that would explain your dump
problem). You had better
fix the data first before trying to lock down the encoding. Once you
change the encoding, backend internal operations will start spitting up
on any stored bad data, whereas right now it's just passing it through
unchanged.
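
One way to check the byte-level guess above is to classify each byte by the
role it can play in UTF-8. This is only a minimal sketch in Python, not
anything PostgreSQL itself runs; the names utf8_byte_role and is_valid_utf8
are hypothetical helpers invented for illustration. Under the UTF-8 rules,
#bf falls in the continuation range (it cannot begin a character) and #ef is
a lead byte that expects two more bytes after it.

```python
def utf8_byte_role(b: int) -> str:
    """Classify a single byte value by the role it can play in UTF-8."""
    if b <= 0x7F:
        return "ascii"          # 0xxxxxxx: a complete 1-byte sequence
    if b <= 0xBF:
        return "continuation"   # 10xxxxxx: only valid after a lead byte
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"        # never appears in well-formed UTF-8
    if b <= 0xDF:
        return "lead-2"         # 110xxxxx: starts a 2-byte sequence
    if b <= 0xEF:
        return "lead-3"         # 1110xxxx: starts a 3-byte sequence
    return "lead-4"             # 11110xxx: starts a 4-byte sequence


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

With these helpers, utf8_byte_role(0xBF) reports a continuation byte,
utf8_byte_role(0xEF) reports the start of a 3-byte sequence, and
is_valid_utf8(b"\xbf\xef") is False, which is consistent with the data being
rejected once the encoding is enforced.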

The safest way might be a dump-n-reload in any case, since reloading
into a fresh UNICODE database will catch bad data. If you try manual
repairs you're likely to miss some places :-(
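
If you do want to see where the bad data sits before reloading, something
along these lines could scan a plain-text dump for lines that are not valid
UTF-8. It is a rough sketch, assuming the dump is a plain SQL text file; the
function name find_invalid_utf8_lines and the file name dump.sql are made up
for illustration. It only reports locations and does not attempt any repair.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid_utf8_lines(
    lines: Iterable[bytes],
) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line_number, byte_offset, raw_line) for each line that
    fails to decode as UTF-8. byte_offset is the position within the
    line of the first offending byte."""
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            yield lineno, exc.start, raw


if __name__ == "__main__":
    # Hypothetical file name; open in binary mode so no decoding
    # happens before the check.
    with open("dump.sql", "rb") as fh:
        for lineno, offset, raw in find_invalid_utf8_lines(fh):
            print(f"line {lineno}, byte {offset}: {raw!r}")
```

Reading the file in binary mode matters here: opening it as text would make
Python decode (and possibly fail or substitute characters) before the check
ever runs.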

regards, tom lane
