Re: BUG #19340: Wrong result from CORR() function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Oleg Ivanov <o15611(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19340: Wrong result from CORR() function
Date: 2025-12-03 22:52:31
Message-ID: 923070.1764802351@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Attached is a fleshed-out patch proposal that fixes the related
aggregates and adds test cases.

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> Also, given that any NaN input results in NaN output, regardless of
> whether or not the inputs are all the same, the values for commonX and
> commonY don't matter if any input is NaN. I think that allows the
> accumulator function's tests to be simplified (no need to test if new
> values are NaN).

I'm not convinced about that. I have it as

if (newvalX != commonX || isnan(newvalX))
commonX = get_float8_nan();

and I think you're saying we could just write

if (newvalX != commonX)
commonX = get_float8_nan();

My concern is that if newvalX is NaN but commonX isn't, then
I believe that the non-NaN-aware "!=" test is supposed to return
false, which'd cause us to not update commonX to NaN as required.
Maybe we could make it work by writing

if (!(newvalX == commonX))
commonX = get_float8_nan();

but I don't have a lot of faith in C compilers getting that right.

> Then there's this:
> SELECT corr(1e-100 + x*1e-105, 1e-100 + x*1e-105) FROM generate_series(1, 3) x;
> The correct answer in this case is also 1, which could be achieved by
> taking the square roots of Sxx and Syy separately, before multiplying,
> but it might also be sensible to clamp the result to the range [-1,1].

Poking at this, I soon found a test case where even with the separate
sqrt() calls we'd produce a result slightly outside [-1, 1] (running
this test over more values of x is sufficient). So now I think we
should do both the separate sqrt and the clamp.

regards, tom lane

Attachment Content-Type Size
v1-detect-constant-input-exactly.patch text/x-diff 22.6 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Daniel Gustafsson 2025-12-03 23:33:55 Re: BUG #19095: Test if function exit() is used fail when linked static
Previous Message Tom Lane 2025-12-03 16:33:36 Re: BUG #19340: Wrong result from CORR() function