Precision loss casting float to numeric

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Precision loss casting float to numeric
Date: 2018-02-26 03:22:38
Message-ID: 5A937D7E.60305@anastigmatix.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Back in

https://www.postgresql.org/message-id/4e384467-f28a-69ce-
75aa-4bc01125a39d%40anastigmatix.net

I got intrigued about casting float values to numeric. Two queries below
(one for float4, one for float8) show what happens for float values with
bits of precision from one up to the limit of the data type. Each value
generated has two 1 bits in the mantissa, one always at 2^0 and the other
at 2^(-p) for increasing p; in other words, each value is the sum of 1 and
2^(-p), and the last such value that compares unequal to 1 is the value
1 plus (FLT_EPSILON or DBL_EPSILON, respectively, for the machine). The
next value generated that way would be indistinguishable from 1.

I'm testing on Intel, IEEE-754 hardware, with 24 bits of precision for
float4 and 53 bits for float8.

TL;DR: casting these values to numeric loses their distinguishability
from 1, six bits early for float4 and five bits early for float8. Casting
to numeric and back again is even worse: you get only six of your 24 bits
back for float4, only 15 of 53 for float8.

Everyone expects the floats to be approximate types and that they
won't be able to exactly represent arbitrary values. However, that's
not how numeric is advertised, and the idea that a numeric can't safely
keep the precision of a float4 or float8, and give it back when you ask,
at best violates least astonishment and arguably is a bug.

I see where at least the first problem (5 or 6 bits lost casting to
numeric) comes in:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3298

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3227

These conversions work by sprintf of the float value to its text
representation with FLT_DIG or DBL_DIG decimal digits of precision.

The mistake is subtle and understandable, because even the C standards
didn't catch it until recently, when they introduced the new constants
FLT_DECIMAL_DIG and DBL_DECIMAL_DIG, because they realized the constant
you need depends on what you're doing.

The old FLT_DIG and DBL_DIG tell you how many digits of a decimal
number you can count on recovering intact after conversion to float
or double and back to decimal. That's inside-out from what's needed
here, where the constant of interest is the number of decimal digits
necessary to reliably represent any float or double so you can get
those original bits back, and that's what the FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG constants are, and they're larger (by 3 and 2 decimal
digits, respectively, on IEEE-754 hardware) than their FLT_DIG and
DBL_DIG counterparts.

So, a trivial fix for float4_numeric and float8_numeric would be to
change the constant used in the sprintf, and that would (I think) solve
at least the first precision-loss problem. But it would only compile
where the compiler is new enough to define the newer constants, and
my preferred fix is to just open-code the conversion using frexp
rather than going through the text representation at all. I'm working
on that patch.

The second problem (losing even more bits in the roundtrip to numeric
and back) suggests that in the other direction, numeric_float4 and
numeric_float8 need some love too, but right now I'm focused on the
to-numeric direction).

-Chap

Columns:
place - the place value of the 1 bit on the right; in the last row,
this is the machine epsilon for the type.

float4gt - whether 1+place is distinguishable from 1 using
float8gt all float4/float8 operations; true for every row.

numericgt - whether 1+place is still distinguishable from 1 after
casting to numeric.

rtgt - whether the roundtrip, 1+place cast to numeric and back,
is still distinguishable from 1. Ends up same as numericgt
(on my platform anyway).

rteq - whether the roundtrip, 1+place cast to numeric and back,
equals the original 1+place. Starts failing quite early.

WITH RECURSIVE
f4(place) AS (
VALUES (1::float4)
UNION
SELECT place/2::float4
FROM f4
WHERE 1::float4 + place/2::float4 > 1::float4
)
SELECT
place,
1::float4 + place > 1::float4 AS float4gt,
(1::float4 + place)::numeric > 1::numeric AS numericgt,
(1::float4 + place)::numeric::float4 > 1::float4 AS rtgt,
(1::float4 + place)::numeric::float4 = 1::float4 + place as rteq
FROM
f4;

place | float4gt | numericgt | rtgt | rteq
-------------+----------+-----------+------+------
1 | t | t | t | t
0.5 | t | t | t | t
0.25 | t | t | t | t
0.125 | t | t | t | t
0.0625 | t | t | t | t
0.03125 | t | t | t | t
0.015625 | t | t | t | f
0.0078125 | t | t | t | f
0.00390625 | t | t | t | f
0.00195312 | t | t | t | f
0.000976562 | t | t | t | f
0.000488281 | t | t | t | f
0.000244141 | t | t | t | f
0.00012207 | t | t | t | f
6.10352e-05 | t | t | t | f
3.05176e-05 | t | t | t | f
1.52588e-05 | t | t | t | f
7.62939e-06 | t | t | t | f
3.8147e-06 | t | f | f | f
1.90735e-06 | t | f | f | f
9.53674e-07 | t | f | f | f
4.76837e-07 | t | f | f | f
2.38419e-07 | t | f | f | f
1.19209e-07 | t | f | f | f
(24 rows)

WITH RECURSIVE
f8(place) AS (
VALUES (1::float8)
UNION
SELECT place/2::float8
FROM f8
WHERE 1::float8 + place/2::float8 > 1::float8
)
SELECT
place,
1::float8 + place > 1::float8 AS float8gt,
(1::float8 + place)::numeric > 1::numeric AS numericgt,
(1::float8 + place)::numeric::float8 > 1::float8 AS rtgt,
(1::float8 + place)::numeric::float8 = 1::float8 + place as rteq
FROM
f8;

place | float8gt | numericgt | rtgt | rteq
----------------------+----------+-----------+------+------
1 | t | t | t | t
0.5 | t | t | t | t
0.25 | t | t | t | t
0.125 | t | t | t | t
0.0625 | t | t | t | t
0.03125 | t | t | t | t
0.015625 | t | t | t | t
0.0078125 | t | t | t | t
0.00390625 | t | t | t | t
0.001953125 | t | t | t | t
0.0009765625 | t | t | t | t
0.00048828125 | t | t | t | t
0.000244140625 | t | t | t | t
0.0001220703125 | t | t | t | t
6.103515625e-05 | t | t | t | t
3.0517578125e-05 | t | t | t | f
1.52587890625e-05 | t | t | t | f
7.62939453125e-06 | t | t | t | f
3.814697265625e-06 | t | t | t | f
1.9073486328125e-06 | t | t | t | f
9.5367431640625e-07 | t | t | t | f
4.76837158203125e-07 | t | t | t | f
2.38418579101562e-07 | t | t | t | f
1.19209289550781e-07 | t | t | t | f
5.96046447753906e-08 | t | t | t | f
2.98023223876953e-08 | t | t | t | f
1.49011611938477e-08 | t | t | t | f
7.45058059692383e-09 | t | t | t | f
3.72529029846191e-09 | t | t | t | f
1.86264514923096e-09 | t | t | t | f
9.31322574615479e-10 | t | t | t | f
4.65661287307739e-10 | t | t | t | f
2.3283064365387e-10 | t | t | t | f
1.16415321826935e-10 | t | t | t | f
5.82076609134674e-11 | t | t | t | f
2.91038304567337e-11 | t | t | t | f
1.45519152283669e-11 | t | t | t | f
7.27595761418343e-12 | t | t | t | f
3.63797880709171e-12 | t | t | t | f
1.81898940354586e-12 | t | t | t | f
9.09494701772928e-13 | t | t | t | f
4.54747350886464e-13 | t | t | t | f
2.27373675443232e-13 | t | t | t | f
1.13686837721616e-13 | t | t | t | f
5.6843418860808e-14 | t | t | t | f
2.8421709430404e-14 | t | t | t | f
1.4210854715202e-14 | t | t | t | f
7.105427357601e-15 | t | t | t | f
3.5527136788005e-15 | t | f | f | f
1.77635683940025e-15 | t | f | f | f
8.88178419700125e-16 | t | f | f | f
4.44089209850063e-16 | t | f | f | f
2.22044604925031e-16 | t | f | f | f
(53 rows)

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-02-26 04:05:45 Re: PlaceHolderVars in pushed down child-join cause error
Previous Message Michael Paquier 2018-02-26 02:57:05 Re: [bug fix] Cascaded standby cannot start after a clean shutdown