Re: Bug in UTF8-Validation Code?

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>
Cc:	Mark Dilger EXTERN <pgsql(at)markdilger(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: Bug in UTF8-Validation Code?
Date:	2007-04-03 14:36:18
Message-ID:	20070403143618.GA5405@svana.org
Lists:	pgsql-hackers

On Tue, Apr 03, 2007 at 11:43:21AM +0200, Albe Laurenz wrote:
> IMHO this is the only good and intuitive way for CHR() and ASCII().

Hardly. The comment earlier about mbtowc was much closer to the mark.
And wide characters are defined as Unicode code points.

Basically, CHR() takes a Unicode code point and returns that character
as a string in the appropriate encoding. ASCII() does the reverse.
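To make the proposed semantics concrete, here is a rough sketch in Python rather than in the backend's C. The names chr_sketch and ascii_sketch are made up for illustration and are not PostgreSQL functions; the encoding argument stands in for the database encoding.

```python
def chr_sketch(code_point: int, encoding: str = "utf-8") -> bytes:
    """Take a Unicode code point, return that character encoded in `encoding`."""
    return chr(code_point).encode(encoding)


def ascii_sketch(data: bytes, encoding: str = "utf-8") -> int:
    """The reverse: return the Unicode code point of the first character."""
    text = data.decode(encoding)
    if not text:
        raise ValueError("empty string has no first character")
    return ord(text[0])


# U+20AC EURO SIGN is three bytes in UTF-8 but one byte in ISO-8859-15.
euro_utf8 = chr_sketch(0x20AC, "utf-8")
euro_latin9 = chr_sketch(0x20AC, "iso-8859-15")
print(euro_utf8.hex(), euro_latin9.hex())

# Either way, the reverse function gives back the same code point.
print(hex(ascii_sketch(euro_utf8, "utf-8")))
print(hex(ascii_sketch(euro_latin9, "iso-8859-15")))
```

The point of the sketch is only that the argument and result are code points, independent of whatever bytes the database encoding uses for the character.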

Just about every multibyte encoding other than Unicode has the problem
of not distinguishing between a code point and the encoding of it.
Unicode is a collection of encodings based on the same character set.
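As a small illustration of that last sentence (again a Python sketch, not backend code), the single code point U+20AC comes out as different byte sequences under UTF-8, UTF-16 and UTF-32, and decoding any of them leads back to the same number:

```python
code_point = 0x20AC  # EURO SIGN
character = chr(code_point)

# One code point, three Unicode encodings of it (big-endian, no BOM).
encoded = {
    name: character.encode(name)
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

for name, raw in encoded.items():
    # The bytes differ per encoding; the decoded code point does not.
    print(name, raw.hex(), hex(ord(raw.decode(name))))
```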

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

In response to

Re: Bug in UTF8-Validation Code? at 2007-04-03 09:43:21 from Albe Laurenz

Responses

Re: Bug in UTF8-Validation Code? at 2007-04-03 15:47:14 from Mark Dilger
