Re: BUG #6015: to_hex and negative integer

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Marco Spiga <ctxspi(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6015: to_hex and negative integer
Date: 2011-05-26 23:47:08
Message-ID: 4DDEE67C.1080100@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 05/26/2011 11:21 PM, Marco Spiga wrote:
> Thanks Craig for your reply.
>
> I have read wikipedia link and what you have tell me is TRUE.
>
> But if I give these select:
> megaradius=# SELECT to_hex(-42);
> to_hex
> ----------
> ffffffd6
> (1 riga)
>
> megaradius=# SELECT to_hex(4294967254);
> to_hex
> ----------
> ffffffd6
> (1 riga)

Ha! Great example. What's happening is that 4294967254 cannot be
represented as a 32-bit signed integer, so it's being extended to a
64-bit integer. As a result, you're comparing a signed int32 to a signed
int64 using a bitwise comparison without extending the int32 to int64,
which is incorrect.

select pg_typeof(-42);
pg_typeof
-----------
integer
(1 row)

select pg_typeof(4294967254);
pg_typeof
-----------
bigint
(1 row)

When converting 4294967254 to a hex representation, to_hex omits leading
zeroes; it's really 0x00000000ffffffd6 . This is fine so long as you
retain information about the format the original number was in - in
particular whether it was a 32-bit or 64-bit value. It's also fine if
you're only ever dealing with unsigned or positive values. Alas, to_hex
_does_ deal with negative values, and by discarding leading zeroes it
discards information about the format of the original number.

In this particular case, 4294967254 doesn't fit into a signed 32 bit
int, but it DOES fit into an unsigned 32-bit int, so it's only eight hex
digits. The result is *really* (int64)0x00000000ffffffd6, but the
omission of leading zeroes is creating confusion.

Personally, I think it's incorrect for to_hex to drop leading zeroes if
it also supports negative numbers. It should do one or the other, or it
should extend all values up to 64-bits before conversion so there's no
confusion with negative numbers. Unfortunately I doubt any of those
options can be implemented for backward compat reasons, but adding
to_hex_32 and to_hex_64 that take only one width would help without
breaking BC.

You can work around this yourself by ensuring that you NEVER call to_hex
with a 32-bit integer. Always extend to a 64-bit value to eliminate
confusion.

SELECT to_hex('4294967254'::bigint);
to_hex
----------
ffffffd6
(1 row)

SELECT to_hex('-42'::bigint);
to_hex
------------------
ffffffffffffffd6

You can create a simple wrapper to do this:

CREATE OR REPLACE FUNCTION to_hex_64 (smallint) RETURNS text AS $$
SELECT to_hex($1::bigint);
$$ LANGUAGE 'sql';

CREATE OR REPLACE FUNCTION to_hex_64 (integer) RETURNS text AS $$
SELECT to_hex($1::bigint);
$$ LANGUAGE 'sql';

CREATE OR REPLACE FUNCTION to_hex_64 (bigint) RETURNS text AS $$
SELECT to_hex($1::bigint);
$$ LANGUAGE 'sql';

Alternately, you can implement a to_hex variant that replaces the
dropped leading zeroes by using pg_typeof(...) to decide whether the
argument is a 32-bit or 64-bit integer and padding the result of to_hex
appropriately. That way there's no confusion.

> SELECT 'true' AS test WHERE (SELECT to_hex(-42)) = (SELECT to_hex(4294967254));
> test

SELECT (SELECT to_hex(-42)) = (SELECT to_hex(4294967254));

?column?
----------
t

SELECT
(SELECT to_hex('-42'::bigint)) = (SELECT to_hex('4294967254'::bigint));

?column?
--------
f

--
Craig Ringer

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomonari Katsumata 2011-05-27 05:26:47 BUG #6042: unlogged table with Streaming Replication
Previous Message Tom Lane 2011-05-26 22:27:29 Re: BUG #6021: There is no difference between default and empty access privileges with \dp