Re: Unicode problem again

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Garry Saddington *EXTERN*" <garry(at)schoolteachers(dot)co(dot)uk>, "pgsql-general General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Unicode problem again
Date: 2008-06-24 07:16:37
Message-ID: D960CB61B694CF459DCFB4B0128514C2023F8F0C@exadv11.host.magwien.gv.at
Lists: pgsql-general

Garry Saddington wrote:
> I have the following error:
>
> Postgres 8.3 via psycopg 1.1.21 and zope 2.10.
>
> ProgrammingError Error Value: ERROR: character 0xe28099 of encoding "UTF8" has no equivalent in "LATIN1" select distinct
[...]

The bytes 0xe28099 are the UTF-8 encoding of UNICODE code point
U+2019, a "right single quotation mark".
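
You can check the bytes from the error message outside the database.
A minimal sketch, assuming Python 3 and only its standard library
(the variable names are just for illustration):

```python
import unicodedata

# The three bytes reported in the error message.
raw = bytes.fromhex("e28099")

# Decode them as UTF-8 to recover the single character they encode.
ch = raw.decode("utf-8")
code_point = "U+%04X" % ord(ch)
name = unicodedata.name(ch)

# LATIN1 (ISO 8859-1) has no position for this character, which is
# the same limitation the server reports when converting for the client.
try:
    ch.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

print(code_point, name, fits_latin1)
```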

This is a "Windows character" - the only non-UNICODE codepages I
know that contain this character are the Microsoft codepages.

Microsoft programs are known to automagically change ASCII
characters to characters like that, so a frequent source of
such characters is copy & paste from a Microsoft text processor.

> I have changed client_encoding to Latin1 to get over errors
> caused by having the database in UTF8 and users trying to
> enter special characters like £ signs.
>
> Unfortunately, it seems there are already UTF8 encodings in
> the DB that have no equivalent in Latin1 from before the change.
>
> How can I get over this problem, and still allow special
> characters, ie have no error reports.

If you want to allow *all* special characters, you will have to
use UNICODE (encoding UTF8 in PostgreSQL) for both the database
and client_encoding, and a pretty comprehensive font to display them.
You should check whether all the software you use supports UNICODE.

By using LATIN1 (or any other non-UNICODE codepage) you allow
*some* special characters. In that case you should not allow all
characters into your database.
You'll have to check data at entry time.
If you are confident that you will never need any non-LATIN1
characters in your database, you could create the database
with LATIN1 encoding; that way there will be an error message at
data entry time.
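
A check at entry time could look roughly like the sketch below.
It assumes Python 3 and that the application sees the input as a
text string before it reaches the database; the helper name
non_latin1_chars is made up for this example and is not part of
psycopg or Zope.

```python
def non_latin1_chars(text):
    """Return the characters in text that LATIN1 cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# The pound sign (U+00A3) is part of LATIN1, so nothing is flagged.
ok = non_latin1_chars("Fee: \u00a35")

# The right single quotation mark (U+2019) is not, so it is flagged.
rejected = non_latin1_chars("the pupil\u2019s work")
```

The application can then reject the input, or replace the flagged
characters (for example U+2019 with a plain apostrophe), before
sending the statement to the server.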

If you know that all your data is from and for Windows, you could
also use encoding WIN1252 throughout.
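
To illustrate why WIN1252 would cover both characters mentioned in
this thread, here is a small sketch. It assumes Python 3, whose
"cp1252" codec is the Windows-1252 code page; the dictionary names
are only for illustration.

```python
# The two characters discussed above: the pound sign and the
# right single quotation mark.
samples = {
    "pound sign": "\u00a3",
    "right single quotation mark": "\u2019",
}

# Both have a single-byte representation in Windows-1252.
encoded = {label: ch.encode("cp1252") for label, ch in samples.items()}

for label, data in encoded.items():
    print(label, data.hex())
```

WIN1252 is still a single-byte code page, so characters outside it
(Cyrillic or Greek letters, for instance) would be rejected in the
same way.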

Yours,
Laurenz Albe
