Re: invalidly encoded strings

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Tom Lane *EXTERN*" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: invalidly encoded strings
Date: 2007-09-10 07:29:52
Message-ID: D960CB61B694CF459DCFB4B0128514C22FB99F@exadv11.host.magwien.gv.at
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>> . for chr() under UTF8, it seems to be generally agreed
>> that the argument should represent the codepoint and the
>> function should return the correspondingly encoded character.
>> If so, possibly the argument should be a bigint to
>> accommodate the full range of possible code points.
>> It is not clear what the argument should represent for other
>> multi-byte encodings for any argument higher than 127.
>> Similarly, it is not clear what ascii() should return in
>> such cases. I would be inclined just to error out.
>
> In SQL_ASCII I'd argue for allowing 0..255. In actual MB
> encodings, OK with throwing error.

I'd like to repeat my suggestion for chr() and ascii().

Instead of the code point, I'd prefer the actual encoding of
the character as argument to chr() and return value of ascii().
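
To make the difference between the two interpretations concrete, here is a minimal Python sketch. The function names are hypothetical and this is not PostgreSQL or Oracle code; it assumes that "the actual encoding" means the character's encoded bytes read as a big-endian integer.

```python
# Hypothetical helpers contrasting the two readings of chr()/ascii().
# Assumption: "the encoding" is the encoded byte sequence taken as a
# big-endian integer.

def chr_by_code_point(n: int) -> str:
    """Treat n as a Unicode code point."""
    return chr(n)

def ascii_by_code_point(s: str) -> int:
    """Return the code point of the first character of s."""
    return ord(s[0])

def chr_by_encoding(n: int, encoding: str = "utf-8") -> str:
    """Treat n as the encoded bytes of one character."""
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big").decode(encoding)

def ascii_by_encoding(s: str, encoding: str = "utf-8") -> int:
    """Return the encoded bytes of the first character of s as an integer."""
    return int.from_bytes(s[0].encode(encoding), "big")

# U+00E9 is encoded in UTF-8 as the bytes C3 A9.
print(ascii_by_code_point("\u00e9"))  # 233
print(ascii_by_encoding("\u00e9"))    # 50089 (0xC3A9)
```

For arguments up to 127 both readings agree; they diverge only for characters that need more than one byte in the server encoding.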

The advantage I see is the following:

- It would make these functions from oracle_compat.c
compatible with Oracle (Oracle's chr() and ascii() work
the way I suggest).

I agree with Tom's earlier suggestion to throw an error for
chr(0), although this is not what Oracle does.
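
A sketch of what the error handling could look like under the encoding-based reading, again in Python with a hypothetical function name. The chr(0) rejection follows the suggestion above; rejecting byte sequences that are not a single valid character is my assumption, in keeping with the subject of this thread.

```python
def chr_checked(n: int, encoding: str = "utf-8") -> str:
    """Encoding-based chr() that rejects 0 and invalid byte sequences.

    Hypothetical illustration only, not the oracle_compat.c code.
    """
    if n <= 0:
        raise ValueError("chr() argument must be positive; 0 is rejected")
    # Read the argument as a big-endian byte sequence.
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError as exc:
        raise ValueError(f"{raw!r} is not valid in {encoding}") from exc
    if len(text) != 1:
        raise ValueError("argument must encode exactly one character")
    return text
```

With this sketch, chr_checked(65) returns "A", while chr_checked(0) and chr_checked(0xC3) (a lone UTF-8 lead byte) both raise an error.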

Of course, if it is generally perceived that the code point
is more useful than the encoding, then Oracle compliance
is probably secondary.

Yours,
Laurenz Albe
