Re: unable to restore 8.2.5

From: "Mikel Lindsaar" <raasdnil(at)gmail(dot)com>
To: "Tore Halset" <halset(at)pvv(dot)ntnu(dot)no>, pgsql-admin(at)postgresql(dot)org
Subject: Re: unable to restore 8.2.5
Date: 2008-09-19 13:37:03
Message-ID: 57a815bf0809190637h49abd565x108ce52bc92a9d0a@mail.gmail.com
Lists: pgsql-admin

On Fri, Sep 19, 2008 at 6:29 PM, Tore Halset <halset(at)pvv(dot)ntnu(dot)no> wrote:
> Looks like I have managed to insert an illegal character into the main
> system that does not conform to UTF-8. Anything I can and should do to work
> around this issue?

I have had the same problem before and, after a lot of help from
Tom Lane, arrived at the following.

You need to dump your table (or a subset containing the row ID and the
column that holds the bad data) as plain text, parse the dump with a
script that detects invalid UTF-8 sequences, then find the rows the bad
data is in and fix them.

The alternative is to drop the bad bytes, inserting some other
character in their place, but this has obvious drawbacks.
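
For what the replacement route could look like, here is a minimal
sketch. It assumes Ruby 2.1 or later, where String#scrub is available;
the helper name replace_invalid_utf8 and the '?' substitute are
illustrative choices only, not something from this thread.

```ruby
# Hypothetical helper: returns a copy of `line` in which every invalid
# UTF-8 byte sequence has been replaced with `replacement`.
# Assumes Ruby >= 2.1 (String#scrub).
def replace_invalid_utf8(line, replacement = '?')
  # dup so the caller's string is left untouched, then tag the bytes as
  # UTF-8 so scrub knows which encoding to validate against.
  line.dup.force_encoding(Encoding::UTF_8).scrub(replacement)
end
```

The original bytes are lost once they are replaced, so it is probably
worth keeping the unmodified dump around.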

I wrote a short Ruby script that goes through a dumped file line by
line and runs each line through Iconv, converting it from UTF-8 to
UTF-8. If the conversion fails, it dumps the offending line to a log
file.

A Ruby script that just prints the offending rows would go something
like this:

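# Iconv shipped with Ruby 1.8/1.9; it was removed from the standard
# library in Ruby 2.0.
require 'iconv'

# Read the dump named on the command line one line at a time.
File.foreach(ARGV[0]) do |line|
  begin
    # Converting UTF-8 to UTF-8 raises if the line is not valid UTF-8.
    Iconv.iconv('UTF-8', 'UTF-8', line)
  rescue Iconv::Failure
    puts "Failed: #{line}"
  end
end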

Save that in a file (find_invalid_utf8.rb) then run it with:

$ ruby find_invalid_utf8.rb my_dumped_table.csv

It's not pretty, and it just dumps the raw offending lines to the
screen, but it might do for you.
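
On Ruby versions without Iconv (it was removed from the standard
library in Ruby 2.0), the same check can be sketched with
String#valid_encoding?. This is only a rough equivalent, not the script
I actually ran: the helper name invalid_utf8_lines is made up for
illustration, and reporting the line number is an addition to make the
bad rows easier to find.

```ruby
# Hypothetical helper: returns [line_number, line] pairs for every line
# of the file at `path` that is not valid UTF-8.
def invalid_utf8_lines(path)
  bad = []
  # Open in binary mode so Ruby does no transcoding of its own.
  File.open(path, 'rb') do |file|
    file.each_line.with_index(1) do |line, number|
      # Tag the raw bytes as UTF-8, then ask whether they are valid.
      line.force_encoding(Encoding::UTF_8)
      bad << [number, line] unless line.valid_encoding?
    end
  end
  bad
end

# Print the offending lines when a dump file is given on the command line.
if ARGV[0]
  invalid_utf8_lines(ARGV[0]).each do |number, line|
    puts "Failed at line #{number}: #{line.inspect}"
  end
end
```

It is run the same way as the script above, with the dumped file as the
only argument.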

--
http://lindsaar.net/
Rails, RSpec and Life blog....
