From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Aggregate Function corr does not always return the correct value |
Date: | 2025-08-26 18:01:24 |
Message-ID: | CAEudQAobLi=Kvk4KSw6PSgcgLWEDTGS7fr1aEEu_23V7MzFuwg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 26, 2025 at 14:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Maxim Orlov <orlovmg(at)gmail(dot)com> writes:
> > One of the clients complained as to why the query for calculating the
> > correlation coefficient with the CORR function yielded such weird
> > results. After a little analysis, it was discovered that they were
> > calculating the correlation coefficient for two sets, one of which is
> > more or less random and the other of which is simply a set of constant
> > values (0.09 if that matters). As a result, they were attaining
> > unexpected results. However, as far as I am aware, they should have
> > received NULL because it is impossible to calculate the standard
> > deviation for such a set.
>
> [ shrug... ] Calculations with float8 are inherently inexact, so
> it's unsurprising that we sometimes fail to detect that the input
> is exactly a horizontal or vertical line. I don't think there is
> anything to be done here that wouldn't end in making things worse.
>
With the check below in place:
    if (Sxx == 0.0 && Syy == 0.0)
        PG_RETURN_NULL();
this test returns NaN:
    WITH dataset AS (SELECT x, 0.125 AS y FROM generate_series(0, 5) AS x)
    SELECT corr(x, y) FROM dataset;
But I can't say whether this answer (NaN) makes things worse.
best regards,
Ranier Vilela