Re: Problem with utf8 encoding

From: Sylvain Racine <syracine(at)sympatico(dot)ca>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: Problem with utf8 encoding
Date: 2009-12-08 22:47:08
Message-ID: hfml1m$1jj6$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-php

Hi Jorge,

I already encountered this problem while programming PHP web application
in French. The problem is with CLIENT_ENCODING variable of your
database. It must
fit with the encoding of your web page. Forget for the
moment to ask query inside pgAdminIII or a Windows shell using psql
(it's different ways to access your Postgresql database). You talked
about propel. Then I suppose you use Symphony framework.

Look at the encoding of the web page where you enter your data
in your browser. Usually, browser is set to automaticaly detect web page
encoding. I assume you have Internet Explorer. Check in the menu
View/Encoding... You should see an encoding name with a large dot in
front. It's the one used to show the web page. It's the same encoding
that Symphony (or PHP) uses to send your data to the database.

An other way to better know the encoding of the web page is to show
source code (View/Source on IE). If you can see near the first lines
something who looks like

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

In this example, iso-8859-1 is your web page encoding. This means you
need to set Postgresql client encoding variable to "latin1". You can
find here a charset table converter for Postgresql:
http://www.postgresql.org/docs/8.1/static/multibyte.html

Once you found your client encoding, you need to set this parameter in
your database. Here, it's important to understand difference between
server and client encoding. In your example, your database is set to
UTF-8 (what you gave as encoding parameter while you created your
database). It's your SERVER encoding. It's mean that all your data are
store in your database in UTF-8 encoding, with collate and ctype in
"Spanish_Peru.1252".

> /CREATE DATABASE sbs/
> / WITH OWNER = sbsadmin/
> / ENCODING = 'UTF8'/
> / LC_COLLATE = 'Spanish_Peru.1252'/
> / LC_CTYPE = 'Spanish_Peru.1252'/

Postgresql comes with the iconv librairy and can convert your data in
almost any other charset depending of your CLIENT encoding. Now, I talk
about your PHP application (web page). You can set your client encoding
for a while with "SET NAMES 'latin1'" at the begin of each time you send
a set of queries to your database, i.e. each time you refresh or change
your web page.

But this way is incomfortable if you not control your SQL query input.
You can set it once in your
postgresql.conf file. Set your database 'client_encoding' variable to
the encoding of your favorite PHP application. Don't forget to restart
your Postgresql Windows service once you changed it.

As mentionned in the documentation, you can also change your client
encoding default setting by changing PGCLIENTENCODING environment
variable. This can be set by right clicking on "Your computer" icon on
your windows desktop, click on Properties, click on Advanced System
Parameters to your left menu, then click on Advanced System Parameters
tab and
click on Environment Variables button. You have choice to set your
variable just inside your windows account (user variables) or for all
the machine (System variables). Then click on New button and add this
variable with the wished value. It's important to note that this
methode must be done on the computer where is put your Postgresql
database server. If your database is not on your computer, use another
method to change your client encoding setting.

This should resolve your problem. About use of psql through a shell, you
must know that Windows shell encoding is 'cp850'. It's a old charset
used with the DOS system who keep Windows some decade behind... Up to
this moment, I didn't find a Postgresql encoding where you can enter and
send your particular symbol... Another Microsoft problem...

Best regards

Sylvain Racine

Jorge Miranda Castañeda wrote:
> Hello everyone!
>
> I'm working in a project using postgres, propel, and php.
>
> My development environment is:
> SO: Windows vista Business SP2
> Postgres: Postgres v8.4
> Propel: Propel generator/runtime v1.4
> PHP: PHP v5.3
>
> Currently I'm struggling with a problem caused by the encoding.
Everytime I try to insert a row into the table CURRENCY, which has ID,
DESC, and SYMBOL as its columns, I get the following error:
> Unable to execute INSERT statement. [wrapped: SQLSTATE[22021]:
Character not in repertoire: 7 ERROR: invalid byte sequence for encoding
"UTF8": 0x80 HINT: This error can also happen if the byte sequence does
not match the encoding expected by the server, which is controlled by
"client_encoding".]
>
> I've created the database using this sentence:
> /CREATE DATABASE sbs/
> / WITH OWNER = sbsadmin/
> / ENCODING = 'UTF8'/
> / LC_COLLATE = 'Spanish_Peru.1252'/
> / LC_CTYPE = 'Spanish_Peru.1252'/
> / CONNECTION LIMIT = -1;/
>
> When I run the following commands in psql I get this:
> /sbs=#SHOW SERVER_ENCODING;/
> /server_encoding/
> /---------------/
> /UTF8/
> /(1 fila)/
>
> /sbs=#SHOW CLIENT_ENCODING;/
> /client_encoding/
> /---------------/
> /UTF8/
> /(1 fila)/
>
> The generated file by propel that manages the database connection
sets the charset to utf8 too. The insertion fails only everytime I try
to insert the currency symbols such as €, ¥, £, ₱, etc. I have no
problem with symbols such S/., $, p., Q, L or any other non-special
symbol. The curious thing is that when I run the sentence INSERT INTO
CURRENCY VALUES(3,'EURO','€'); using the query tool of pgAdminIII, I got
no error and the query was sucessfully executed. However running the
same sentence in psql, I got the row inserted but with ? instead of € as
the currency symbol.
> I think the utf8 is the apropiate charset to represent these symbols
but I have no idea why I'm getting this problem. I've spent the whole
day struggling with this problem and I can't figure it out yet. I have
tried many things, none of them worked. So this is why I'm here asking
for your help, as my last resource. Any guidance that could put me in
the right direction to address this problem will be really appreciated!
>
> Thanks in advance and I'm sorry If I went on a bit long...
>
> --
> Jorge Miranda Castañeda.
> Simbiosys Software S.A.C.
> www.simbiosysonline.com <http://www.simbiosysonline.com>
>

In response to

Browse pgsql-php by date

  From Date Subject
Next Message Nadine Mauch 2010-01-12 18:56:01 Capitalized Databases and columns names
Previous Message Andrew McMillan 2009-12-03 09:35:14 Re: Problem with utf8 encoding