From: | Jasen Betts <jasen(at)xnet(dot)co(dot)nz> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: finding rows with invalid characters |
Date: | 2010-11-30 19:08:47 |
Message-ID: | id3i3v$j4l$1@reversiblemaps.ath.cx |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 2010-11-21, Sim Zacks <sim(at)compulab(dot)co(dot)il> wrote:
> I am using PG 8.2.17 with UTF8 encoding.
> "PostgreSQL 8.2.17 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.1
> (Gentoo 4.1.1)"
>
> One of my tables somehow has invalid characters in it:
>> ERROR: invalid byte sequence for encoding "UTF8": 0xa9
>> HINT: This error can also happen if the byte sequence does not match
>> the encoding expected by the server, which is controlled by
>> "client_encoding".
> I have already manually found a number of the bad rows by running
> queries with text functions (upper) between groups of IDs until I found
> the specific bad row.
>
> 1) Is there a quicker way to get a list of all rows with invalid characters
dumpthe table, run it through iconv , diff agaist the original.
> 2) Shouldn't the database prevent these rows from being entered in the
> first place?
it should have, but that bug has now been fixed.
> 3) I have backups of this database (using -Fc) and I noticed that on
> restore, this table is not restored because of this error. Is there a
> way to fix the existing backups, or tell the restore to ignore bad rows
> instead of erroring out the whole table?
translate them to SQL (use pg_resore with no databse name)
then you can again use iconv to clean them.
use iconv something like this.
iconv --from-code UTF8 --to-code UTF8 -c < input_file > output_file
This will translate surrogates and drop other invalid characters.
if you have any constraints that place lower bounds on string-length
this has the potential to break them.
--
⚂⚃ 100% natural
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2010-11-30 19:21:57 | Re: how can i bugfix "idle in transaction" lockups ? |
Previous Message | Tom Lane | 2010-11-30 18:48:39 | Re: [GENERAL] column-level update privs + lock table |