Re: Bug in UTF8-Validation Code?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>
Cc: pgsql(at)markdilger(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug in UTF8-Validation Code?
Date: 2007-04-05 12:29:42
Message-ID: 20070405122942.GC17587@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 05, 2007 at 11:52:14AM +0200, Albe Laurenz wrote:
> But isn't a simple fix for chr() and ascii(), which does not
> require a redesign, a Good Thing for 8.3 if possible? Something
> that maintains as much upward and/or Oracle compatibility as
> possible while doing away with ascii('EUR') returning 226 in UTF-8?

I think the earlier expressed idea of getting chr/ascii to bail on
non-ascii character isn't a bad one.

I think your idea is bad in the sense that I actually need to know the
encoding of the character I want to be able to use it. If I knew the
encoding already, I could just use encode().

What I was thinking of was something that, irrespective of the
encoding, gave me a string properly encoded with the character I want.
Since AFAIK Unicode is the only character set that actually numbers the
characters in a way not related to the encoding, so it would seem
useful to be able to give a unicode character number and get a string
with that character...

So your implemntation is simply:
1. Take number and make UTF-8 string
2. Convert it to database encoding.

> And I think - correct me if I am wrong - that conversion between
> character and integer representation of the character in the current
> database encoding is exactly that.

AFAIK there is no "integer representation" of a character in anything
other than Unicode. Unicode is the only case that cannot be handled by
a simple encode/decode.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2007-04-05 12:38:09 Re: Buildfarm failures en masse
Previous Message Andrew Dunstan 2007-04-05 12:18:25 Re: [HACKERS] --enable-xml instead of --with-libxml?