Re: BUG #19340: Wrong result from CORR() function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Oleg Ivanov <o15611(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19340: Wrong result from CORR() function
Date: 2025-12-02 20:26:43
Message-ID: 434484.1764707203@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On Tue, 2 Dec 2025 at 17:22, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wonder whether it'd be worth carrying additional state to
>> check that explicitly (instead of assuming that "if (Sxx == 0 ||
>> Syy == 0)" will catch it).

> I wondered the same thing. It's not nice to have to do that, but
> clearly the existing test for constant inputs is no good. The question
> is, do we really want to spend extra cycles on every query just to
> catch this odd corner case?

I experimented with the attached patch, which is very incomplete;
I just carried it far enough to be able to run performance checks on
the modified code, and so all the binary statistics aggregates except
corr() are broken. I observe about 2% slowdown on this test case:

SELECT corr( 0.09 , 0.09000001 ) FROM generate_series(1,100000000);

I think that any real-world usage is going to expend more effort
obtaining the input data than this test does, so 2% should be a
conservative upper bound on the cost. Seems to me that getting
NULL-or-not right is probably worth a percent or so.

If anyone feels differently, another idea could be to use a
separate state transition function for corr() that skips the
accumulation steps that corr() doesn't use. But I agree with
the pre-existing decision to use just one transition function
for all the binary aggregates.

If this seems like a reasonable approach, I'll see about finishing
out the patch.

regards, tom lane

Attachment Content-Type Size
wip-detect-constant-input-exactly.patch text/x-diff 4.9 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2025-12-02 20:53:46 Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
Previous Message Dean Rasheed 2025-12-02 18:54:20 Re: BUG #19340: Wrong result from CORR() function