Re: Bug in UTF8-Validation Code?

From:	Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Bug in UTF8-Validation Code?
Date:	2007-04-03 13:43:08
Message-ID:	slrnf14mfc.2i67.andrew+nonews@atlantis.supernews.net
Lists:	pgsql-hackers

On 2007-04-03, "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at> wrote:
> According to RFC 2279, the Euro,
> Unicode code point 0x20AC = 0010 0000 1010 1100,
> will be encoded to 1110 0010 1000 0010 1010 1100 = 0xE282AC.
>
> IMHO this is the only good and intuitive way for CHR() and ASCII().

It is beyond ludicrous for functions like chr() or ascii() to convert a
Euro sign to 0xE282AC rather than 0x20AC. "Intuitive"? There is _NO SUCH
THING_ as 0xE282AC as a representation of a Unicode character - there is
either the code point, 0x20AC (which is a _number_), or the sequences of
_bytes_ that represent that code point in various encodings, of which the
three-byte sequence 0xE2 0x82 0xAC is the one used in UTF-8.

Functions like chr() and ascii() should be dealing with the _number_ of the
code point, not with its representation in transfer encodings.
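
To make the distinction concrete, here is a minimal Python sketch (an illustration, not PostgreSQL code). It uses Python's built-in ord() and chr(), which work on code point numbers, and str.encode(), which produces byte sequences. The helper utf8_encode_3byte is a hypothetical name introduced only for this example; it hand-packs the bits the way the quoted message describes and does not reject surrogate code points.

```python
# Illustration only: code point (a number) versus encoded bytes.
euro = "\u20ac"  # the Euro sign

code_point = ord(euro)                  # the number of the code point
utf8_bytes = euro.encode("utf-8")       # its byte sequence in UTF-8
utf16_bytes = euro.encode("utf-16-be")  # a different byte sequence in UTF-16


def utf8_encode_3byte(cp: int) -> bytes:
    """Pack a code point in U+0800..U+FFFF into the 3-byte UTF-8 layout.

    Hypothetical helper for this example; surrogates are not rejected.
    """
    if not 0x0800 <= cp <= 0xFFFF:
        raise ValueError("code point outside the 3-byte UTF-8 range")
    return bytes([
        0xE0 | (cp >> 12),           # 1110xxxx: top 4 bits
        0x80 | ((cp >> 6) & 0x3F),   # 10xxxxxx: middle 6 bits
        0x80 | (cp & 0x3F),          # 10xxxxxx: low 6 bits
    ])


# Reading the UTF-8 bytes as one big-endian integer gives 0xE282AC,
# which is not a code point, only a side effect of that one encoding.
bytes_as_int = int.from_bytes(utf8_bytes, "big")
```

Here code_point is 0x20AC regardless of encoding, while the bytes differ between UTF-8 and UTF-16; chr(0x20AC) returns the Euro sign in this model.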

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

In response to

Re: Bug in UTF8-Validation Code? at 2007-04-03 09:43:21 from Albe Laurenz

Responses

Re: Bug in UTF8-Validation Code? at 2007-04-03 15:47:27 from Albe Laurenz