ascii() picks up sign bit past CHAR value 127

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: ascii() picks up sign bit past CHAR value 127
Date: 2001-01-19 06:36:43
Message-ID: 200101190636.f0J6ahP11414@hub.org
Lists: pgsql-bugs

ascii() returns negative ASCII values? (9sch1(at)txl(dot)com) reports a bug with a severity of 4
The lower the number the more severe it is.

Short Description
ascii() picks up sign bit past CHAR value 127

Long Description
The lack of an UNSIGNED INT1 attribute type forces those of us
who need a positive numeric byte type to use CHAR. The ascii() function ostensibly returns the numeric ASCII value of the corresponding CHAR attribute value - but once you get beyond the 0-127 ASCII character value range, the ascii() function starts picking up the active high order bit as a sign bit and returns a negative number. This is not too surprising,
but it is a bit bizarre, since I tend to think of character encoding standards as having the option of using the 128-255 character values.
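
The behaviour described above is what one would expect if the stored byte is read as a signed 8-bit value before being widened to an integer. That cause is an assumption, not something confirmed from the ascii() source. A minimal Python sketch of the effect, outside the database and with hypothetical helper names, plus the usual fix of masking with 0xFF:

```python
def as_signed_byte(value: int) -> int:
    """Interpret a 0-255 byte value as a signed 8-bit integer."""
    return int.from_bytes(bytes([value]), "big", signed=True)


def as_unsigned_byte(value: int) -> int:
    """Recover the 0-255 value by keeping only the low eight bits."""
    return value & 0xFF


# Same inputs as the sample code in this report.
for code in (127, 128, 129, 130):
    signed = as_signed_byte(code)
    print(code, signed, as_unsigned_byte(signed))
```

Under that assumption, 127 is unaffected, while 128, 129 and 130 come back as -128, -127 and -126 until the mask is applied.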

Just in case anyone was wondering, there are many good reasons
to have an unsigned int1 type. For example, I am using one byte
numbers to define the bytes of an int4 (or int8) word. The first
byte partitions up the word's value range into 256 ranges. Within each of these the second byte adds up to 256 value range partitions - and so on. This encodes a breadth (<256) and depth (<4/8) limited hierarchy designation as a single int4/int8 attribute. This
designation makes it fast to find items/records that fall under any node/sub-tree within the original hierarchical designation/category/etc. In other words, this is a trick for
*VERY* fast, albeit strictly limited, transitive closure.
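
To make the byte packing concrete, here is a rough Python sketch of the encoding described above. The function names are hypothetical, and it assumes (my choice for illustration, not something stated in the report) that byte value 0 means "no node at this level", so shorter paths are padded with zero bytes:

```python
def pack_path(path, width=4):
    """Pack up to `width` node numbers (each 0-255) into one integer.

    The first level goes into the most significant byte; unused
    levels are padded with 0 (an assumption of this sketch).
    """
    if len(path) > width:
        raise ValueError("path is deeper than the word allows")
    word = 0
    for level in range(width):
        byte = path[level] if level < len(path) else 0
        if not 0 <= byte <= 255:
            raise ValueError("each node number must fit in one byte")
        word = (word << 8) | byte
    return word


def unpack_path(word, width=4):
    """Split the integer back into its `width` bytes, top level first."""
    return [(word >> (8 * (width - 1 - level))) & 0xFF for level in range(width)]


key = pack_path([3, 7])
print(hex(key), unpack_path(key))
```

Note that with a four byte word, any path whose first byte is 128 or more gives a value of at least 2^31, which does not fit in a signed four byte integer without going negative. That is where the need for unsigned bytes shows up.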

First, the int4/int8 word is BTREE indexed. Then this index is range scanned to find all the items that appear in/under any node/sub-tree of the original hierarchy. That sure beats something like Oracle's dreadfully slow CONNECT BY syntax.
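
The range scan can be sketched the same way. In this Python illustration a sorted list stands in for the BTREE index and bisect stands in for the index range scan; both are stand-ins of mine, not PostgreSQL internals, and the function names are hypothetical:

```python
from bisect import bisect_left


def subtree_range(prefix, width=4):
    """Return (low, high) so that low <= key < high covers the node
    named by `prefix` and everything beneath it."""
    if not 1 <= len(prefix) <= width:
        raise ValueError("prefix must have between 1 and `width` levels")
    low = 0
    for byte in prefix:
        if not 0 <= byte <= 255:
            raise ValueError("each node number must fit in one byte")
        low = (low << 8) | byte
    shift = 8 * (width - len(prefix))  # bits left over for deeper levels
    low <<= shift
    return low, low + (1 << shift)


def scan(sorted_keys, prefix, width=4):
    """Return every key in the sorted list that falls inside the subtree."""
    low, high = subtree_range(prefix, width)
    return sorted_keys[bisect_left(sorted_keys, low):bisect_left(sorted_keys, high)]


# Made-up keys for illustration only.
keys = sorted([0x02FF0000, 0x03070000, 0x03070100, 0x03070105, 0x03080000])
print([hex(k) for k in scan(keys, [3, 7])])
```

Everything under node 3/7 lands in one contiguous slice of the sorted keys, which is why a single index range scan is enough to answer the sub-tree query.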

At any rate, we need to deal with unsigned numeric bytes - and
PostgreSQL doesn't make that easy. I imagine many folks have
already thought about extending the basic types with unsigned variants. Perhaps I have missed support for unsigned types in
the documentation (I don't think this is SQL std)? I imagine many folks have thought about supporting a one byte integer to round out the basic type suite (for many reasons). I'd like to add my voice to calls for both.

Thanks

Sample Code
select ascii(ichar(127));
select ascii(ichar(128));
select ascii(ichar(129));
select ascii(ichar(130));

No file was uploaded with this report
