From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Inaccurate results from numeric ln(), log(), exp() and pow() |
Date: | 2015-12-17 15:38:06 |
Message-ID: | CAEZATCU-CKSVODJGE7S2XeY+W-pnk_B5phjUQ89K0S8JxaoV1g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10 December 2015 at 20:02, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> It seems to be a loss of 4 digits in every case I've seen.
>
> I wouldn't have a problem with, say, throwing in an extra DEC_DIGITS worth
> of rscale in each of these functions so that the discrepancies tend to
> favor more significant digits out, rather than fewer. I don't know that
> it's worth trying to guarantee that the result is never fewer digits than
> before, and I certainly wouldn't want to make the rules a lot more complex
> than what's there now. But perhaps we could cover most cases easily.
>
Looking at this, it appears that those extra digits of precision for
log(0.5) in the old code are an anomaly that only occurs for a certain
range of inputs. According to the code comments these functions
intentionally output at least around 16 significant digits (or more if
the input has greater precision), so that they output at least the
precision of floating point. For example, in both 9.5 and HEAD:
select exp(5::numeric);
exp
--------------------
148.41315910257660
select exp(0.5::numeric);
exp
--------------------
1.6487212707001281
select ln(5::numeric);
ln
--------------------
1.6094379124341004
select ln(0.5::numeric);
ln
---------------------
-0.6931471805599453
select power(0.5::numeric, 0.4::numeric);
power
--------------------
0.7578582832551990
However, the old log() code would occasionally output 4 more digits
than that, due to it's mis-estimation of the result weight, which was
used to determine the output scale. So, for example, in 9.5:
select log(0.0005::numeric);
log
---------------------
-3.3010299956639812
select log(0.005::numeric);
log
---------------------
-2.3010299956639812
select log(0.05::numeric);
log
-------------------------
-1.30102999566398119521
select log(0.5::numeric);
log
-------------------------
-0.30102999566398119521
select log(5::numeric);
log
------------------------
0.69897000433601880479
select log(50::numeric);
log
--------------------
1.6989700043360188
select log(500::numeric);
log
--------------------
2.6989700043360188
i.e., for a certain range of inputs the result precision jumps from 16
to 20 digits after the decimal point, whereas in HEAD the precision of
the results is more consistent across the range:
select log(0.0005::numeric);
log
---------------------
-3.3010299956639812
select log(0.005::numeric);
log
---------------------
-2.3010299956639812
select log(0.05::numeric);
log
---------------------
-1.3010299956639812
select log(0.5::numeric);
log
---------------------
-0.3010299956639812
select log(5::numeric);
log
--------------------
0.6989700043360188
select log(50::numeric);
log
--------------------
1.6989700043360188
select log(500::numeric);
log
--------------------
2.6989700043360188
With other inputs, the actual number of significant digits can vary
between 16 and 17, but it's generally better behaved than the old
code, even though it sometimes produces fewer digits. I think it ought
to be sufficient to release note that the number of digits returned by
these functions may have changed, in addition to the results being
more accurate.
Regards,
Dean
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2015-12-17 16:00:47 | Re: WIP: bloom filter in Hash Joins with batches |
Previous Message | Tomas Vondra | 2015-12-17 15:34:47 | Re: WIP: bloom filter in Hash Joins with batches |