Re: Bug in UTF8-Validation Code?

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <pgsql(at)markdilger(dot)com>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug in UTF8-Validation Code?
Date: 2007-04-04 13:26:38
Message-ID: 20070404132638.GB8549@alvh.no-ip.org
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> On Tue, Apr 03, 2007 at 01:06:38PM -0400, Tom Lane wrote:
> > I think it's probably defensible for non-Unicode encodings. To do
> > otherwise would require (a) figuring out what the equivalent concept to
> > "code point" is for each encoding, and (b) having a separate code path
> > for each encoding to perform the mapping. It's not clear that there
> > even is an answer to (a), and (b) seems like more work than chr() is
> > worth. But we know what the right way is for Unicode, so we should
> > special case that one.
>
> I dunno. I find it odd that if I want a pl/pgsql function to return a
> Euro symbol, it has to know what encoding the DB is in. Though I
> suppose that would call for a unicode_chr() function.

Right -- IMHO what we should be doing is rejecting any input to chr() that
is beyond plain ASCII (or maybe > 255), and creating a separate function
(unicode_char() sounds good) to get a Unicode character from a code
point, converted to the local client_encoding per conversion_procs.

So if I'm in Latin-1 and ask for the Euro sign, this should fail because
Latin-1 does not have the Euro sign. If I'm in Latin-9 I should get the
Euro.
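
To make the proposed behaviour concrete, here is a rough sketch in Python
rather than in the backend's C. The names ascii_chr and unicode_chr are
hypothetical and exist only for this illustration; Python's codec names
stand in for the database encodings, and its built-in encoders stand in
for conversion_procs. It is an assumption-laden model of the idea, not
how PostgreSQL implements or would implement it.

```python
def ascii_chr(code: int) -> bytes:
    """Hypothetical strict chr(): accept plain ASCII only.

    The upper bound of 127 is one of the two limits floated above
    (the other being 255); the choice here is an assumption.
    """
    if not 0 <= code <= 127:
        raise ValueError(f"code {code} is outside plain ASCII")
    return bytes([code])


def unicode_chr(code_point: int, encoding: str) -> bytes:
    """Hypothetical unicode_chr(): map a Unicode code point to bytes
    in the target encoding, failing if the encoding cannot represent it.
    """
    # str.encode raises UnicodeEncodeError when the character has no
    # representation in the requested encoding.
    return chr(code_point).encode(encoding)


EURO = 0x20AC  # U+20AC EURO SIGN

# Latin-9 (ISO 8859-15) has the Euro sign at byte 0xA4.
print(unicode_chr(EURO, "iso8859-15"))

# Latin-1 (ISO 8859-1) has no Euro sign, so the call fails.
try:
    unicode_chr(EURO, "latin-1")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

With this split, the caller asks for a code point and never needs to
know which encoding is in use; the request either yields the right
bytes for that encoding or fails outright.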

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
