ascii() picks up sign bit past CHAR value 127

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: ascii() picks up sign bit past CHAR value 127
Date: 2001-01-19 06:36:43
Message-ID: 200101190636.f0J6ahP11414@hub.org
Lists: pgsql-bugs
ascii() returns negative ASCII values? (9sch1(at)txl(dot)com) reports a bug with a severity of 4
The lower the number the more severe it is.

Short Description
ascii() picks up sign bit past CHAR value 127

Long Description
The lack of an UNSIGNED INT1 attribute type forces those of us who need
a positive numeric byte type to use CHAR.  The ascii() function
ostensibly returns the numeric ASCII value of the corresponding CHAR
attribute value, but once you get beyond the 0-127 ASCII character
value range, ascii() starts treating the high-order bit as a sign bit
and returns negative values.  This is not too surprising, but it is a
bit bizarre, since I tend to think of character encoding standards as
having the option of using the 128-255 character values.
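
The report does not say where the sign comes from. One plausible
mechanism, assumed here rather than confirmed by the report, is that the
byte is read through a signed 8-bit type before being widened to an
integer. The Python sketch below only illustrates that arithmetic; the
helper names are hypothetical and it is not PostgreSQL's implementation.

```python
def as_signed_byte(value: int) -> int:
    """Interpret the low 8 bits of value as a two's-complement signed byte."""
    return int.from_bytes(bytes([value & 0xFF]), byteorder="big", signed=True)


def as_unsigned_byte(value: int) -> int:
    """Recover the 0-255 value by masking off everything above the low 8 bits."""
    return value & 0xFF


# Same inputs as the sample code in the report.
for code in (127, 128, 129, 130):
    signed = as_signed_byte(code)
    print(code, signed, as_unsigned_byte(signed))
# prints:
# 127 127 127
# 128 -128 128
# 129 -127 129
# 130 -126 130
```

Under that assumption, masking the result with 0xFF is enough to get the
unsigned value back.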

Just in case anyone was wondering, there are many good reasons to have
an unsigned int1 type.  For example, I am using one-byte numbers to
define the bytes of an int4 (or int8) word.  The first byte partitions
the word's value range into 256 ranges.  Within each of these, the
second byte adds up to 256 further partitions, and so on.  This encodes
a breadth (<256) and depth (<4/8) limited hierarchy designation as a
single int4/int8 attribute.  This designation makes it fast to find
items/records that fall under any node/sub-tree within the original
hierarchical designation/category/etc.  In other words, this is a trick
for *VERY* fast, albeit strictly limited, transitive closure.
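
The byte-per-level encoding described above can be sketched in Python as
follows. This is a minimal illustration, not the author's code: the
names pack_path and subtree_range are hypothetical, keys are treated as
unsigned 32-bit values (mapping them onto PostgreSQL's signed int4 is
not shown), and byte value 0 is assumed to mean "no node at this level".

```python
WIDTH = 4  # bytes in an int4; use 8 for an int8 key


def pack_path(path, width=WIDTH):
    """Pack a list of per-level byte values into one integer, first level highest."""
    if len(path) > width:
        raise ValueError("hierarchy is deeper than the key allows")
    key = 0
    for level in range(width):
        byte = path[level] if level < len(path) else 0  # pad unused levels with 0
        if not 0 <= byte <= 255:
            raise ValueError("each level must fit in one unsigned byte")
        key = (key << 8) | byte
    return key


def subtree_range(path, width=WIDTH):
    """Return the inclusive (low, high) key range covering path and all descendants."""
    low = pack_path(path, width)
    free_bits = 8 * (width - len(path))
    high = low | ((1 << free_bits) - 1)
    return low, high


print(hex(pack_path([1, 2])))                   # 0x1020000
print([hex(k) for k in subtree_range([1, 2])])  # ['0x1020000', '0x102ffff']
```

Every key whose first two bytes are 1 and 2 falls inside that range,
which is what makes a sub-tree lookup a single contiguous interval.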

First, the int4/int8 word is BTREE indexed.  Then this index is range scanned to find all the items that appear in/under any node/sub-tree of the original hierarchy.  That sure beats something like Oracle's dreadfully slow CONNECT BY syntax.
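
As a rough illustration of the range scan, the Python sketch below uses
a sorted list and binary search as a stand-in for the BTREE index. The
keys and the range_scan helper are made up for the example, and the
bounds are the ones for the sub-tree whose first two bytes are 1 and 2.

```python
import bisect


def range_scan(sorted_keys, low, high):
    """Return all keys in the inclusive range [low, high] from a sorted list."""
    start = bisect.bisect_left(sorted_keys, low)
    stop = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:stop]


# Hypothetical packed hierarchy keys, one byte per level.
keys = sorted([
    0x01000000,  # node 1
    0x01020000,  # node 1.2
    0x01020300,  # node 1.2.3
    0x01020304,  # node 1.2.3.4
    0x01030000,  # node 1.3
    0x02000000,  # node 2
])

hits = range_scan(keys, 0x01020000, 0x0102FFFF)
print([hex(k) for k in hits])  # ['0x1020000', '0x1020300', '0x1020304']
```

The two binary searches locate the ends of the interval, and everything
between them belongs to the sub-tree, with no recursive traversal.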

At any rate, we need to deal with unsigned numeric bytes, and
PostgreSQL doesn't make that easy.  I imagine many folks have already
thought about extending the basic types with unsigned variants.
Perhaps I have missed support for unsigned types in the documentation
(I don't think this is SQL std)?  I imagine many folks have also
thought about supporting a one-byte integer to round out the basic type
suite (for many reasons).  I'd like to add my voice to calls for both.

Thanks

Sample Code
select ascii(ichar(127));
select ascii(ichar(128));
select ascii(ichar(129));
select ascii(ichar(130));

No file was uploaded with this report

