Re: adjust chr()/ascii() to prevent invalidly encoded data

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: adjust chr()/ascii() to prevent invalidly encoded data
Date: 2007-09-21 01:20:19
Lists: pgsql-patches
Andrew Dunstan wrote:
> The attached patch is intended to ensure that chr() does not produce 
> invalidly encoded data, as recently discussed on -hackers. For UTF8, we 
> treat its argument as a Unicode code point; for all other multi-byte 
> encodings, we raise an error on any argument greater than 127. For all 
> encodings we raise an error if the argument is 0 (we don't allow null bytes 
> in text data). The ascii() function is adjusted so that it remains the 
> inverse of chr() - i.e. for UTF8 it returns the Unicode code point, and it 
> raises an error for any other multi-byte encoding if the argument is
> outside the ASCII range. I have tested this inverse property across the
> entire Unicode code point range, 0x01 .. 0x1ffff.
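
The rules in the quoted summary can be sketched roughly as below. This is an illustration only, not the patch itself: the names chr_patched and ascii_patched and the is_multibyte flag are hypothetical, and the single-byte branch (returning the raw byte) is an assumption, since the summary does not describe it.

```python
def chr_patched(code: int, server_encoding: str, is_multibyte: bool) -> bytes:
    """Encode `code` following the rules in the quoted summary (sketch)."""
    if code == 0:
        # Null bytes are not allowed in text data, whatever the encoding.
        raise ValueError("null character not permitted")
    if server_encoding.upper() == "UTF8":
        # UTF8: the argument is treated as a Unicode code point.
        return chr(code).encode("utf-8")
    if is_multibyte and code > 127:
        # Any other multi-byte encoding: only ASCII is accepted.
        raise ValueError("argument too large for a non-UTF8 multi-byte encoding")
    if code < 0 or code > 255:
        raise ValueError("argument out of range for a single byte")
    # Assumption: single-byte encodings simply get the raw byte value.
    return bytes([code])


def ascii_patched(data: bytes, server_encoding: str, is_multibyte: bool) -> int:
    """Inverse of chr_patched for the first character of `data` (sketch)."""
    if not data:
        return 0
    if server_encoding.upper() == "UTF8":
        # UTF8: return the Unicode code point of the first character.
        return ord(data.decode("utf-8")[0])
    if is_multibyte and data[0] > 127:
        raise ValueError("first character is outside the ASCII range")
    return data[0]
```

With these definitions, ascii_patched(chr_patched(n, "UTF8", True), "UTF8", True) gives back n for any encodable code point n, which is the inverse property the summary refers to.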

Hmm, is this what we had agreed?  I'm not sure I like it; if I'm using
chr() to produce characters, then the application is going to have to
worry about server_encoding in order to find the correct parameter to
pass to chr().

What I thought was the idea is that chr() always gets a Unicode code
point, and it converts the character to the server_encoding.  If the
character cannot be converted, then it raises an error.
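
To make the alternative concrete, here is a minimal sketch of that behaviour, assuming Python's codec names stand in for server_encoding. The function name chr_via_unicode is hypothetical and nothing here reflects actual backend code.

```python
def chr_via_unicode(code_point: int, server_codec: str) -> bytes:
    """Treat the argument as a Unicode code point in every encoding (sketch).

    `server_codec` is a Python codec name used as a stand-in for
    server_encoding; that mapping is an assumption of this example.
    """
    if code_point == 0:
        # Null bytes are not allowed in text data.
        raise ValueError("null character not permitted")
    try:
        # Convert the character to the target encoding.
        return chr(code_point).encode(server_codec)
    except UnicodeEncodeError as exc:
        # The character has no representation in this encoding.
        raise ValueError(
            f"code point {code_point:#x} cannot be converted to {server_codec}"
        ) from exc
```

Under this scheme the caller passes the same argument whatever the encoding: 0xE9 yields b"\xe9" in latin-1 and b"\xc3\xa9" in utf-8, while 0x20AC raises an error in latin-1 because that encoding has no Euro sign.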

Alvaro Herrera                      
The PostgreSQL Company - Command Prompt, Inc.
