Re: infinite histogram bounds and nan (Re: comment regarding double timestamps; and, infinite timestamps and NaN)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: infinite histogram bounds and nan (Re: comment regarding double timestamps; and, infinite timestamps and NaN)
Date: 2020-01-02 14:11:45
Message-ID: 12957.1577974305@sss.pgh.pa.us
Lists: pgsql-hackers

Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> On Mon, Dec 30, 2019 at 02:18:17PM -0500, Tom Lane wrote:
>> This answer is simply broken. You've caused it to estimate half
>> of the bucket, which is an insane estimate for the given bucket
>> boundaries and WHERE constraint.

> I'm fine if the isnan() logic changes, but the comment indicates it's intended
> to be hit for an infinite histogram bound, but that doesn't work for timestamps
> (convert_to_scalar() should return (double)INFINITY and not
> (double)INT64_MIN/MAX).

I suppose the code you're looking at is

    binfrac = (val - low) / (high - low);

    /*
     * Watch out for the possibility that we got a NaN or
     * Infinity from the division.  This can happen
     * despite the previous checks, if for example "low"
     * is -Infinity.
     */
    if (isnan(binfrac) ||
        binfrac < 0.0 || binfrac > 1.0)
        binfrac = 0.5;

This doesn't really have any goals beyond "make sure we get a result
between 0.0 and 1.0, even if the calculation went pear-shaped for
some reason". You could make an argument that it should be like

    if (isnan(binfrac))
        binfrac = 0.5;      /* throw up our hands for NaN */
    else if (binfrac <= 0.0)
        binfrac = 0.0;      /* clamp in case of -Inf or -0 */
    else if (binfrac > 1.0)
        binfrac = 1.0;      /* clamp in case of +Inf */

which would probably produce saner results in edge cases like these.
I think it'd also obviate the need for fooling with the conversion in
convert_to_scalar: while DT_NOBEGIN/DT_NOEND wouldn't produce exactly
the same result (hard 0.0 or 1.0) as an infinity, they'd produce
results very close to that.
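
To make the edge cases concrete, here is a minimal standalone sketch of the
clamping variant above. It is not the actual selfuncs.c code: the function
names binfrac_clamped() and binfrac_demo() are hypothetical, the checks that
precede the division in the real code are omitted, and (double) INT64_MIN /
(double) INT64_MAX merely stand in for what a DT_NOBEGIN/DT_NOEND bound
might convert to.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/*
 * Hypothetical helper: position of "val" within the bucket [low, high],
 * clamped to 0.0 .. 1.0 as in the variant suggested above.
 */
static double
binfrac_clamped(double val, double low, double high)
{
    double      binfrac = (val - low) / (high - low);

    if (isnan(binfrac))
        binfrac = 0.5;      /* throw up our hands for NaN */
    else if (binfrac <= 0.0)
        binfrac = 0.0;      /* clamp in case of -Inf or -0 */
    else if (binfrac > 1.0)
        binfrac = 1.0;      /* clamp in case of +Inf */
    return binfrac;
}

/* Hypothetical self-check exercising ordinary and extreme bucket bounds. */
static void
binfrac_demo(void)
{
    double      f;

    /* ordinary bucket: plain linear interpolation */
    assert(binfrac_clamped(25.0, 0.0, 100.0) == 0.25);

    /* out-of-range values are clamped rather than reset to 0.5 */
    assert(binfrac_clamped(150.0, 0.0, 100.0) == 1.0);
    assert(binfrac_clamped(-50.0, 0.0, 100.0) == 0.0);

    /* high = +Infinity: finite / Inf is 0, so a hard 0.0 */
    assert(binfrac_clamped(0.0, -100.0, INFINITY) == 0.0);

    /* low = -Infinity: Inf / Inf is NaN, so this still lands on 0.5 */
    assert(binfrac_clamped(0.0, -INFINITY, 100.0) == 0.5);

    /* assumed integer-style "infinite" bounds: close to 1.0 or 0.0 */
    f = binfrac_clamped(0.0, (double) INT64_MIN, 1e15);
    assert(f > 0.999 && f < 1.0);
    f = binfrac_clamped(0.0, -1e15, (double) INT64_MAX);
    assert(f > 0.0 && f < 0.001);
}
```

Note that in this sketch a true -Infinity for "low" still goes through the
NaN branch, because Inf / Inf is NaN, whereas the finite INT64_MIN-style
bound yields a fraction just under 1.0.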

regards, tom lane
