Re: finding bogus UTF-8

From: Vick Khera <vivek(at)khera(dot)org>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: finding bogus UTF-8
Date: 2011-02-15 21:20:40
Message-ID: AANLkTinYG6GZwbqLagqVRtX3phGGJ0m9e-nJOwyeRiJA@mail.gmail.com
Lists: pgsql-general

On Tue, Feb 15, 2011 at 11:09 AM, Geoffrey Myers
<lists(at)serioustechnology(dot)com> wrote:
> comments would be appreciated.
>

If all you're doing is filtering stdin to stdout and deleting a range
of characters, it seems that tr would be a faster tool:

tr -d '\000-\010\013-\037\177-\377' < foo.txt > foo-cleaned.txt
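
For anyone wanting to try this on a small sample first, here is a rough sketch. The function name strip_ctrl and the sample bytes are made up for illustration. It writes the upper bound of the first range as \010 (octal for decimal 8), since 8 is not an octal digit, and it assumes LC_ALL=C so that tr works on raw bytes. Keep in mind that the \177-\377 range drops every byte with the high bit set, so valid multi-byte UTF-8 sequences go away too, not just the bogus ones.

```shell
# Hypothetical wrapper around the tr filter (the name is an assumption).
# Deletes NUL..BS, VT..US, and DEL..0xFF; keeps tab (\011) and newline (\012).
strip_ctrl() {
    LC_ALL=C tr -d '\000-\010\013-\037\177-\377'
}

# Sample input: "a", BEL (\007), "b", TAB, "8", byte 0xE9 (\351), newline.
cleaned=$(printf 'a\007b\t8\351\n' | strip_ctrl)

# Dump what survived, byte by byte.
printf '%s\n' "$cleaned" | od -c
```

The BEL and the 0xE9 byte are removed, while the tab, the newline, and the digit 8 are left alone.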
