Re: how to ignore invalid byte sequence for encoding without using sql_ascii?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: "detrox(at)gmail(dot)com" <detrox(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: how to ignore invalid byte sequence for encoding without using sql_ascii?
Date: 2007-10-02 07:30:25
Message-ID: 20071002073025.GA12469@svana.org
Lists: pgsql-general

On Thu, Sep 27, 2007 at 02:28:27AM -0700, detrox(at)gmail(dot)com wrote:
> I am now importing the dump file of wikipedia into my postgresql using
> maintains/importDump.php. It fails on 'ERROR: invalid byte sequence
> for encoding UTF-8'. Is there any way to let pgsql just ignore the
> invalid characters ( i mean that drop the invalid ones ), that the
> script will keep going without die on this error.

No, postgres does not destroy data. If you want bits of your data
removed you need to write your own tool to do it.
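
As a rough illustration of what such a tool could look like, here is a
minimal Python sketch. It is not part of PostgreSQL or MediaWiki; the
function names are made up for this example. It relies on Python's
UTF-8 decoder to decide what counts as invalid, and it assumes the
input really is mostly UTF-8 (if the file is actually in another
encoding, dropping bytes will lose real characters).

```python
from typing import Optional


def first_invalid_offset(raw: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first offending byte.
        return exc.start
    return None


def strip_invalid_utf8(raw: bytes) -> bytes:
    """Return a copy of raw with bytes that are not valid UTF-8 dropped."""
    # errors="ignore" silently skips undecodable bytes; re-encoding gives
    # back clean UTF-8. This is lossy by design.
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


if __name__ == "__main__":
    sample = b"abc\xffdef"  # 0xff can never appear in valid UTF-8
    print(first_invalid_offset(sample))
    print(strip_invalid_utf8(sample))
```

Running first_invalid_offset over the dump before stripping anything
shows where the bad bytes are, which helps tell a handful of damaged
characters apart from a file that is in a different encoding.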

That said, are you sure that the data you're importing is UTF-8?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
