Re: Inaccurate results from numeric ln(), log(), exp() and pow()

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inaccurate results from numeric ln(), log(), exp() and pow()
Date: 2015-12-17 15:38:06
Message-ID: CAEZATCU-CKSVODJGE7S2XeY+W-pnk_B5phjUQ89K0S8JxaoV1g@mail.gmail.com
Lists: pgsql-hackers

On 10 December 2015 at 20:02, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> It seems to be a loss of 4 digits in every case I've seen.
>
> I wouldn't have a problem with, say, throwing in an extra DEC_DIGITS worth
> of rscale in each of these functions so that the discrepancies tend to
> favor more significant digits out, rather than fewer. I don't know that
> it's worth trying to guarantee that the result is never fewer digits than
> before, and I certainly wouldn't want to make the rules a lot more complex
> than what's there now. But perhaps we could cover most cases easily.
>

Looking at this, it appears that those extra digits of precision for
log(0.5) in the old code are an anomaly that only occurs for a certain
range of inputs. According to the code comments, these functions
intentionally output around 16 significant digits (or more if the
input has greater precision), so that their results carry at least the
precision of floating point. For example, in both 9.5 and HEAD:

select exp(5::numeric);
exp
--------------------
148.41315910257660

select exp(0.5::numeric);
exp
--------------------
1.6487212707001281

select ln(5::numeric);
ln
--------------------
1.6094379124341004

select ln(0.5::numeric);
ln
---------------------
-0.6931471805599453

select power(0.5::numeric, 0.4::numeric);
power
--------------------
0.7578582832551990
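
As a quick cross-check of the "precision of floating point" remark
above, here is a minimal standalone C illustration (not PostgreSQL
code, just the standard <float.h> limits): an IEEE double guarantees
15 decimal digits and needs 17 to round-trip exactly, so the ~16
significant digits from these functions sits right at double
precision:

#include <stdio.h>
#include <float.h>

int
main(void)
{
    /* decimal digits a double is guaranteed to preserve */
    printf("DBL_DIG = %d\n", DBL_DIG);                  /* 15 */
    /* decimal digits needed to round-trip any double exactly (C11) */
    printf("DBL_DECIMAL_DIG = %d\n", DBL_DECIMAL_DIG);  /* 17 */
    return 0;
}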

However, the old log() code would occasionally output 4 more digits
than that 16-digit baseline, due to its misestimation of the result
weight, which was used to determine the output scale. So, for example,
in 9.5:

select log(0.0005::numeric);
log
---------------------
-3.3010299956639812

select log(0.005::numeric);
log
---------------------
-2.3010299956639812

select log(0.05::numeric);
log
-------------------------
-1.30102999566398119521

select log(0.5::numeric);
log
-------------------------
-0.30102999566398119521

select log(5::numeric);
log
------------------------
0.69897000433601880479

select log(50::numeric);
log
--------------------
1.6989700043360188

select log(500::numeric);
log
--------------------
2.6989700043360188

That is, for a certain range of inputs the result precision jumps from
16 to 20 digits after the decimal point, whereas in HEAD the precision
of the results is more consistent across the range:

select log(0.0005::numeric);
log
---------------------
-3.3010299956639812

select log(0.005::numeric);
log
---------------------
-2.3010299956639812

select log(0.05::numeric);
log
---------------------
-1.3010299956639812

select log(0.5::numeric);
log
---------------------
-0.3010299956639812

select log(5::numeric);
log
--------------------
0.6989700043360188

select log(50::numeric);
log
--------------------
1.6989700043360188

select log(500::numeric);
log
--------------------
2.6989700043360188

With other inputs, the actual number of significant digits can vary
between 16 and 17, but the new code is generally better behaved than
the old code, even though it sometimes produces fewer digits. I think
it ought to be sufficient to mention in the release notes that the
number of digits returned by these functions may have changed, in
addition to the results being more accurate.
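
For what it's worth, here is a minimal, self-contained sketch of the
mechanism involved (not the actual numeric.c logic; the function name
choose_rscale, the MIN_SIG_DIGITS constant and the weight-in-NBASE-digits
convention are assumptions for illustration): the output scale is derived
from an estimate of the result's weight, and since each NBASE digit spans
DEC_DIGITS (= 4) decimal digits, an estimate that is off by one NBASE
digit moves the chosen rscale by exactly 4 decimal digits -- the 16-vs-20
digit jump seen above:

#include <stdio.h>

#define DEC_DIGITS     4    /* decimal digits per NBASE digit */
#define MIN_SIG_DIGITS 16   /* target number of significant digits */

/* Choose the number of digits after the decimal point from an
 * estimate of the result's weight, expressed in NBASE digits. */
static int
choose_rscale(int result_weight_nbase)
{
    int dweight = result_weight_nbase * DEC_DIGITS;

    return MIN_SIG_DIGITS - dweight;
}

int
main(void)
{
    /* weight estimate of 0: 16 digits after the decimal point */
    printf("rscale for weight estimate  0: %d\n", choose_rscale(0));   /* 16 */
    /* estimate off by one NBASE digit: rscale jumps by DEC_DIGITS */
    printf("rscale for weight estimate -1: %d\n", choose_rscale(-1));  /* 20 */
    return 0;
}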

Regards,
Dean
