Re: Character encoding problems

From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Character encoding problems
Date: 2011-12-09 08:20:21
Message-ID: 4EE1C4C5.9050209@hogranch.com
Lists: pgsql-general

On 12/08/11 7:54 PM, Bruce Clay wrote:
> Is there a "proper" encoding type that I should use to load the word lists so they can be interoperable with the WordNet dataset that happily uses the UTF8 encoding?

some of your input data may be in encodings other than UTF8, for
instance, LATIN1. if you can identify which sources those are, and use
SET CLIENT_ENCODING=... before loading each one, you should be able to
import from the various data sources.
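
a rough sketch of the "identify these" step, in Python. it tries a strict UTF8 decode on a source's bytes and, if that fails, falls back to an encoding you name. the helper names and the LATIN1 default are assumptions for illustration; a failed decode only proves the data is not UTF8, it does not tell you which legacy encoding it really is.

```python
def guess_client_encoding(data: bytes, fallback: str = "LATIN1") -> str:
    """Return a PostgreSQL client_encoding name for a chunk of input bytes.

    Hypothetical helper: 'UTF8' if the bytes decode strictly as UTF-8,
    otherwise the caller-supplied fallback (an assumption, not a detection).
    """
    try:
        data.decode("utf-8", errors="strict")
        return "UTF8"
    except UnicodeDecodeError:
        return fallback


def set_encoding_statement(data: bytes, fallback: str = "LATIN1") -> str:
    """Build the SET statement to issue before loading this data."""
    return f"SET client_encoding TO '{guess_client_encoding(data, fallback)}';"
```

you would run the returned statement in the same session, just before the COPY or INSERTs for that source.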

otherwise, you might have to run the data through some sort of filter
before you feed it to postgres, I dunno. a 0x82 byte on its own is not
valid UTF8: bytes 0x80-0xBF can only appear as continuation bytes after
a lead byte.
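
one possible shape for such a filter, sketched in Python: lines that are already valid UTF8 pass through untouched, anything else is re-decoded from a fallback encoding and written back out as UTF8. the function names and the latin-1 fallback are my assumptions; if the stray bytes actually come from some other code page, pass that instead.

```python
from typing import BinaryIO


def to_utf8(line: bytes, fallback: str = "latin-1") -> bytes:
    """Return the line as UTF-8 bytes, transcoding only when needed."""
    try:
        line.decode("utf-8")  # strict by default; raises on bytes like a lone 0x82
        return line
    except UnicodeDecodeError:
        # Assumption: every non-UTF-8 line is in the fallback encoding.
        return line.decode(fallback, errors="replace").encode("utf-8")


def filter_stream(src: BinaryIO, dst: BinaryIO, fallback: str = "latin-1") -> None:
    """Copy src to dst line by line, so the output is UTF-8 throughout."""
    for line in src:
        dst.write(to_utf8(line, fallback))
```

called as filter_stream(sys.stdin.buffer, sys.stdout.buffer), this can sit in a pipe in front of psql. note that it decides per line, so a line mixing both encodings gets treated entirely as the fallback.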

--
john r pierce N 37, W 122
santa cruz ca mid-left coast
