Re: Changing character set when the damage is done

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Alexis Paul Bertolini <bertolini(at)computer(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Changing character set when the damage is done
Date: 2006-12-24 16:57:34
Message-ID: 20061224165734.GA43931@winnie.fuhr.org
Lists: pgsql-sql

On Sun, Dec 24, 2006 at 05:05:37PM +0100, Alexis Paul Bertolini wrote:
> They show up in PHP, PgAdminIII and psql. All the same. A lowercase e
> with a grave accent appears as a capital A with the cedilla, followed by
> an umlaut (just the umlaut, on its own). So to answer your question,
> they are two characters.

Are you sure that's not a tilde (a wavy line above the A) instead
of a cedilla (a hook below the A)? The UTF-8 encoding for lowercase e
with grave is 0xc3 0xa8, which in ISO-8859-1 (LATIN1) or Windows-1252
is uppercase A with tilde followed by a diaeresis (an umlaut on its
own). Does the data appear correctly if you do either of the following?

SELECT convert(colname, 'utf8', 'latin1') FROM tablename;
SELECT convert(colname, 'utf8', 'win1252') FROM tablename;
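The byte-level reasoning above can be checked outside the database. The following is a minimal Python sketch, not part of the original message; the variable names are hypothetical. It encodes a lowercase e with grave as UTF-8, misreads the two bytes as LATIN1 and as Windows-1252, and then reverses the misreading.

```python
# Illustration only: variable names are hypothetical, not from the thread.
original = "\u00e8"                       # lowercase e with grave

# UTF-8 stores this character as the two bytes 0xc3 0xa8.
utf8_bytes = original.encode("utf-8")

# Misreading those bytes one-per-character gives A-tilde plus a diaeresis.
misread_latin1 = utf8_bytes.decode("latin-1")
misread_win1252 = utf8_bytes.decode("cp1252")

# Reversing the misreading recovers the intended character.
repaired = misread_latin1.encode("latin-1").decode("utf-8")

print(utf8_bytes.hex(" "))
print(repr(misread_latin1), repr(repaired))
```

Both single-byte encodings agree on 0xc3 and 0xa8, which is why this particular character looks the same under either misreading; they only differ in the 0x80 to 0x9f range.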

If you use characters like "smart quotes" or the Euro sign then
you'll probably need to use win1252 instead of latin1, because
ISO-8859-1 has no code points for them. Does the following show a
Euro sign or does it show blank?

SELECT convert('\342\202\254', 'utf8', 'win1252');
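The octal escapes in that query are the three UTF-8 bytes of the Euro sign. As a rough Python sketch (not from the original message; the names are hypothetical), the same bytes can be decoded and then re-encoded to show why win1252 can represent the character while latin1 cannot.

```python
# Illustration only: variable names are hypothetical, not from the thread.
euro_bytes = b"\342\202\254"          # octal escapes, same as 0xe2 0x82 0xac
euro = euro_bytes.decode("utf-8")     # the Euro sign, U+20AC

# Windows-1252 assigns the Euro sign to the single byte 0x80.
win1252_euro = euro.encode("cp1252")

# ISO-8859-1 has no Euro sign, so encoding to it fails.
try:
    euro.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

print(euro_bytes.hex(" "), win1252_euro.hex(), latin1_has_euro)
```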

--
Michael Fuhr
