Re: invalidly encoded strings

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: andrew(at)dunslane(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: invalidly encoded strings
Date: 2007-09-11 01:42:10
Message-ID: 8729.1189474930@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> If you regard the unicode code point as simply a number, why not
> regard the multibyte characters as a number too?

Because there's a standard specifying the Unicode code points *as
numbers*. The mapping from those numbers to UTF8 strings (and other
representations) is well-defined by the standard.

> Also I'm wondering you what we should do with different
> backend/frontend encoding combo.

Nothing. chr() has always worked with reference to the database
encoding, and we should keep it that way.

BTW, it strikes me that there is another hole that we need to plug in
this area, and that's the convert() function. Being able to create
a value of type text that is not in the database encoding is simply
broken. Perhaps we could make it work on bytea instead (providing
a cast from text to bytea but not vice versa), or maybe we should just
forbid the whole thing if the database encoding isn't SQL_ASCII.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2007-09-11 01:57:43 Re: invalidly encoded strings
Previous Message Tatsuo Ishii 2007-09-11 01:15:49 Re: invalidly encoded strings

Browse pgsql-patches by date

  From Date Subject
Next Message Andrew Dunstan 2007-09-11 01:57:43 Re: invalidly encoded strings
Previous Message Tatsuo Ishii 2007-09-11 01:15:49 Re: invalidly encoded strings