Re: Unicode, php and postgresql

From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: Didier Bretin <dbr(at)informactis(dot)com>
Cc: pgsql-php(at)postgresql(dot)org
Subject: Re: Unicode, php and postgresql
Date: 2003-12-09 08:59:50
Message-ID: 0BC6CE32-2A26-11D8-AAF1-0005029FC1A7@myrealbox.com
Lists: pgsql-php

Hi Didier!

On Tuesday, December 9, 2003, at 05:34 PM, Didier Bretin wrote:

> Hi,
>
> I'm trying to set up 7.4.0 + PHP to develop an application in
> Unicode. So far I apparently have no problems ;).
>
> But I don't understand the PHP documentation well enough. My PostgreSQL
> server is configured for Unicode, and my database is entirely in
> Unicode. In my php.ini file I set no mbstring variables. When I connect
> to the database, I SELECT the data and then print it to the browser
> with the charset utf-8, and all the characters are displayed correctly.
>
> My question is: is this the right way? Do I really not have to
> configure anything in PHP to deal with Unicode :) ?

In my (admittedly limited) experience with PHP 4, Unicode, and
PostgreSQL, you can go a long way with the setup you describe, i.e., not
using the multi-byte string functions. However, all I do is move data in
and out of the database: I'm not doing any fancy-pants parsing of the
data in PHP, and that includes data sanity checking (besides preventing
SQL injection). I would *not* recommend doing it as I've done, though it
does work for me. It's something I'm working on rectifying in my own
code, and rather than have to fix it later, I'd recommend doing it
right the first time.

The reason it works is that PHP (at least as of PHP 4) is agnostic about
the contents of strings. It just takes them from the database and hands
them to your code as sequences of bytes, without trying to read, parse,
or check them unless you explicitly do so in your code.
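
To illustrate what "agnostic" means here, a minimal sketch, assuming the
mbstring extension is loaded (the variable names are just for
illustration). The byte-oriented functions pass UTF-8 through untouched,
but they count and cut bytes, not characters:

```php
<?php
// Three Japanese characters, written as their nine UTF-8 bytes so the
// example does not depend on the encoding of this source file.
$text = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

// strlen() counts bytes; it knows nothing about the encoding.
$byteCount = strlen($text);

// mb_strlen() counts characters when told the encoding explicitly.
$charCount = mb_strlen($text, 'UTF-8');

// A byte-oriented substr() can cut a character in half:
// four bytes is one whole character plus one stray lead byte.
$brokenPrefix = substr($text, 0, 4);

// mb_substr() works in characters, so the result stays well-formed.
$cleanPrefix = mb_substr($text, 0, 1, 'UTF-8');
```

As long as the strings are only passed from the database to the browser,
none of this matters, which is why the setup above appears to work.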

Again, I don't recommend this (though I've been doing it myself)
because I don't believe you'll be able to do proper data checking,
especially if you're using code points outside ASCII. For me, this
means the Japanese that moves into my database is completely unchecked,
and like I said, that's Not Good. To do proper checking of the
Japanese, I'd need to use the mbstring (mb_*) functions.
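
As a rough sketch of the kind of check I mean, assuming the mbstring
extension is loaded and a PHP version that provides mb_check_encoding()
(as far as I know it arrived in later releases than the ones discussed
here). The helper name is_acceptable_text() and the length limit are
hypothetical, not anything from my own code:

```php
<?php
// Hypothetical helper: accept $value only if it is well-formed UTF-8
// and no longer than $maxChars characters (not bytes).
function is_acceptable_text($value, $maxChars)
{
    // Reject byte sequences that are not valid UTF-8.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false;
    }
    // Enforce the limit in characters rather than bytes.
    return mb_strlen($value, 'UTF-8') <= $maxChars;
}

// Two Japanese characters as six UTF-8 bytes.
$valid = "\xE6\x97\xA5\xE6\x9C\xAC";

// The first two bytes of a three-byte character: not valid UTF-8.
$truncated = "\xE6\x97";

$validOk = is_acceptable_text($valid, 10);
$truncatedOk = is_acceptable_text($truncated, 10);
```

A check like this would run before the value is escaped and sent to
PostgreSQL, so malformed input is rejected in PHP and not left for the
database to complain about.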

I'm interested in hearing others' opinions on this as well,
particularly if they think I'm wrong. I can always learn something!

hth

Michael Glaesemann
grzm myrealbox com
