Mark Dilger wrote:
> Tom Lane wrote:
>> Mark Dilger <pgsql(at)markdilger(dot)com> writes:
>>>> pgsql=# select chr(14989485);
>>>> (1 row)
>> Is there a principled rationale for this particular behavior as
>> opposed to any other?
>> In particular, in UTF8 land I'd have expected the argument of chr()
>> to be interpreted as a Unicode code point, not as actual UTF8 bytes
>> with a randomly-chosen endianness.
>> Not sure what to do in other multibyte encodings.
> "Not sure what to do in other multibyte encodings" was pretty much my
> rationale for this particular behavior. I standardized on network byte
> order because there are only two endiannesses to choose from, and the
> other seems to be a more surprising choice.
> I looked around on the web for a standard for how to convert an integer
> into a valid multibyte character and didn't find anything. Andrew,
> Supernews has said upthread that chr() is clearly wrong and needs to be
> fixed. If so, we need a clear definition of what "fixed" means.
> Any suggestions?
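To make the two candidate definitions concrete, here is a rough sketch in Python rather than in the backend's C. It is only an illustration, not the patch under discussion: the function names chr_as_utf8_bytes and chr_as_code_point are hypothetical. The first treats the integer as the UTF-8 byte sequence itself in network byte order, and the second treats it as a Unicode code point.

```python
def chr_as_utf8_bytes(n: int) -> str:
    """Hypothetical: treat n as the UTF-8 bytes themselves, most significant byte first."""
    length = max(1, (n.bit_length() + 7) // 8)
    # decode() raises UnicodeDecodeError if the bytes are not valid UTF-8
    return n.to_bytes(length, "big").decode("utf-8")


def chr_as_code_point(n: int) -> str:
    """Hypothetical: treat n as a Unicode code point."""
    # Python's chr() raises ValueError for values above 0x10FFFF
    return chr(n)


n = 14989485
print(hex(n))                          # 0xe4b8ad, i.e. bytes E4 B8 AD
c = chr_as_utf8_bytes(n)
print(f"U+{ord(c):04X}")               # U+4E2D
print(chr_as_code_point(0x4E2D) == c)  # True: same character, different argument

try:
    chr_as_code_point(n)
except ValueError:
    print("14989485 is not a valid code point")
```

Under the byte interpretation the argument 14989485 yields U+4E2D, while under the code point interpretation the same character needs the argument 19997 (0x4E2D) and 14989485 is out of range.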
Another issue to consider when thinking about the correct definition of chr() is
that ascii(chr(X)) = X. This gets weird if X is greater than 255. If nothing
else, the name "ascii" is no longer appropriate.
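A small Python sketch of that round-trip property, under the assumption that the inverse function mirrors whichever definition chr() ends up with. The names ascii_as_code_point and ascii_as_utf8_bytes are made up for illustration and are not backend functions.

```python
def ascii_as_code_point(s: str) -> int:
    """Hypothetical inverse if chr() takes a Unicode code point."""
    return ord(s[0])


def ascii_as_utf8_bytes(s: str) -> int:
    """Hypothetical inverse if chr() takes the UTF-8 bytes in network byte order."""
    return int.from_bytes(s[0].encode("utf-8"), "big")


for x in (65, 0xE9, 0x4E2D):
    c = chr(x)  # Python's chr() takes a code point
    # The round trip holds for the code point definition...
    assert ascii_as_code_point(c) == x
    # ...while the byte definition returns a different integer above 127.
    print(x, ascii_as_utf8_bytes(c))
# prints:
# 65 65
# 233 50089
# 19997 14989485
```

Either way the round trip can be made to hold as long as both functions use the same definition, but for anything past 127 the result is no longer an ASCII value.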