How to remove non-UTF values from a table?

From: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
To: PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: How to remove non-UTF values from a table?
Date: 2009-12-14 11:03:25
Message-ID: e373d31e0912140303u75e51c92u4b5d7b58083bf658@mail.gmail.com
Lists: pgsql-general

An easy question for some, I hope.

I have a database from the 8.2 days. When I dump it and try to load
the dump into 8.3.7, I get UTF-8 encoding errors.

I tried searching this list's archives but could not come up with an answer.

Google turns up pages like this one:
http://sniptools.com/databases/finding-non-utf8-values-in-postgresql
but I'm not clear on how to use what it describes.

Following the SQL on this site I could identify some columns that
contain text like this:

"Évolution générale de la situation démographique"

So my guess is that the non-English characters were not written as
proper UTF-8 in the first place.
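
One way to check that guess outside the database is to reproduce the
garbling. The sketch below assumes the damage came from UTF-8 bytes being
read as Latin-1 (ISO-8859-1). That is only an assumption: the encoding
actually involved could be a different single-byte one, such as
Windows-1252. The variable names are illustrative.

```python
# Assumption: the text was stored as UTF-8 bytes but interpreted as Latin-1.
intended = "Évolution générale de la situation démographique"

# Encode the intended text as UTF-8, then misread those bytes as Latin-1.
utf8_bytes = intended.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")

# Each accented letter becomes two characters, e.g. "é" turns into "Ã©".
print(garbled)
```

If the printed text resembles what is in the table, the stored values are
probably double-encoded rather than random corruption.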

Is there a way in SQL to find the affected columns and replace their
contents with UTF-8 equivalents using PostgreSQL commands? I couldn't
find anything under "String Functions" (chapter 9 of the manual).
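
For reference, here is a minimal sketch of what such a replacement could
look like if done outside the database, in Python, on values already
fetched as strings. It assumes the same Latin-1 mix-up as a working
hypothesis, and `repair_mojibake` is a hypothetical helper name, not an
existing PostgreSQL or library function.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return text unchanged if it does not apply."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8 afterwards: leave as is.
        return text


print(repair_mojibake("situation dÃ©mographique"))
print(repair_mojibake("situation démographique"))
```

Values that are already correct fail the round trip and come back
untouched, so the function can be run over a whole column. It is still a
heuristic: a correct string that happens to look like valid double-encoded
text would be altered, so test it on a copy of the data first.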

We're on CentOS.

Thanks!
