Re: float output precision questions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: "Pedro M(dot) Ferreira" <pfrazao(at)ualg(dot)pt>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: float output precision questions
Date: 2002-10-31 15:38:59
Message-ID: 7078.1036078739@sss.pgh.pa.us
Lists: pgsql-hackers

Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> writes:
> On Wed, 30 Oct 2002, Tom Lane wrote:
>> sprintf(ascii, "%.*g", DBL_DIG+1, num);
>> and similarly for float4. Given carefully written float I/O routines,
>> reading the latter output should reproduce the originally stored value.

> Well, on my system, it doesn't look like doing the above sprintfs will
> actually work for all numbers. I did a simple program using an arbitrary
> big number and the DBL_DIG+1 output when stuck into another double
> actually was a different double value. DBL_DIG+2 worked on my system,
> but...

Oh, you're right; I had forgotten about the effects of scale.
DBL_DIG=15 means that the system claims to distinguish all 15-digit
values, but in a binary system there's more headroom at the bottom end
of a decimal order of magnitude. For example, 15-digit values are fine:

regression=# select 100000000000001::float8 - 100000000000000::float8;
?column?
----------
1
(1 row)

regression=# select 999999999999999::float8 - 999999999999998::float8;
?column?
----------
1
(1 row)

but the 9-etc values are over three binary orders of magnitude larger
than the 1-etc values, and so they have three fewer spare bits at the
right end. The system would be lying to claim DBL_DIG=16:

regression=# select 9999999999999999::float8 - 9999999999999998::float8;
?column?
----------
2
(1 row)

even though values a little over 1e15 are represented perfectly
accurately:

regression=# select 1000000000000001::float8 - 1000000000000000::float8;
?column?
----------
1
(1 row)

If you experiment with 17-digit values, you find that the representable
values are about 2 counts apart near 1e16:

regression=# select 10000000000000001::float8 - 10000000000000000::float8;
?column?
----------
0
(1 row)

regression=# select 10000000000000002::float8 - 10000000000000000::float8;
?column?
----------
2
(1 row)

but they're about 16 counts apart near 9e16:

regression=# select 99999999999999992::float8 - 99999999999999990::float8;
?column?
----------
16
(1 row)

regression=# select 99999999999999991::float8 - 99999999999999990::float8;
?column?
----------
0
(1 row)

which is exactly what you'd expect, seeing that the values lie three
binary orders of magnitude apart, so their spacing differs by a factor of 8.
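
The spacing seen in the sessions above can also be measured directly. What
follows is only a sketch, assuming IEEE 754 binary64 doubles and a positive,
finite input below DBL_MAX; the name spacing_above is made up for
illustration and is not part of PostgreSQL or libc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Gap between x and the next larger representable double.
 *
 * Assumption: double is IEEE 754 binary64 and x is positive and finite
 * (and below DBL_MAX), so adding 1 to the bit pattern gives the next
 * value up.
 */
double
spacing_above(double x)
{
    uint64_t bits;
    double   next;

    assert(sizeof(double) == sizeof(uint64_t));
    assert(x > 0.0);

    memcpy(&bits, &x, sizeof bits);     /* reinterpret without aliasing issues */
    bits++;                             /* step to the adjacent bit pattern */
    memcpy(&next, &bits, sizeof next);

    return next - x;
}
```

Under those assumptions, spacing_above(1e16) is 2 and spacing_above(9e16)
is 16, matching the "counts apart" figures in the queries, while
spacing_above(999999999999999.0) is 0.125, which is why 15-digit integers
still come out distinct.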

Bottom line: if DBL_DIG=15 and the float arithmetic is binary, then
there are some double values that require 17 displayed digits to
distinguish, even though not all 16-digit numbers are distinct.
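
A test along the lines of the one Stephan describes might look like the
sketch below. It assumes the C library's printf and strtod round correctly
(the "carefully written float I/O routines" condition); round_trips is a
hypothetical helper name, not an existing function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Print num with the given number of significant digits, read the text
 * back with strtod(), and report whether the same double comes back.
 * Returns 1 on an exact round trip, 0 otherwise.
 */
int
round_trips(double num, int digits)
{
    char ascii[64];     /* ample room for "%.40g" output */

    assert(digits > 0 && digits <= 40);

    snprintf(ascii, sizeof ascii, "%.*g", digits, num);
    return strtod(ascii, NULL) == num;
}
```

On an IEEE 754 platform, round_trips(0.30000000000000004, 16) should come
back 0, because DBL_DIG+1 digits collapse that value to "0.3", which reads
back as a different double; with 17 digits the same call should return 1.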

So I retract my original proposal and instead suggest that we offer
a switch to display either DBL_DIG or DBL_DIG+2 significant digits
(and correspondingly increase the digits for float4). The DBL_DIG+2
case should handle the need for exact dump/reload.
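
To make the proposal concrete, here is a rough sketch of what such a switch
could look like. Everything named here (float_output_mode, format_float8,
format_float4) is hypothetical and is not the actual backend code. The
float4 digit count is not spelled out above; the sketch assumes FLT_DIG+3
(9 digits), which is what IEEE single precision needs for an exact round
trip.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical switch: compact display versus exact dump/reload. */
typedef enum
{
    OUTPUT_COMPACT = 0,     /* DBL_DIG / FLT_DIG significant digits */
    OUTPUT_EXACT = 1        /* enough digits to reproduce the stored value */
} float_output_mode;

/* Format a float8; returns the length snprintf() reports. */
int
format_float8(char *buf, size_t buflen, double num, float_output_mode mode)
{
    int digits = DBL_DIG + (mode == OUTPUT_EXACT ? 2 : 0);

    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%.*g", digits, num);
}

/* Format a float4; the +3 for exact mode is an assumption (see above). */
int
format_float4(char *buf, size_t buflen, float num, float_output_mode mode)
{
    int digits = FLT_DIG + (mode == OUTPUT_EXACT ? 3 : 0);

    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%.*g", digits, num);
}
```

With DBL_DIG=15, the compact mode keeps the familiar short output (0.3 for
the value nearest 0.30000000000000004), and the exact mode prints all 17
digits so that a dump can be reloaded without change.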

regards, tom lane
