Re: [PATCH] Fix overflow and underflow in regr_r2()

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [PATCH] Fix overflow and underflow in regr_r2()
Date: 2026-05-16 09:39:30
Message-ID: CAEZATCUaV+qmBCh0zv0EPdBvhE2skHCzkRw78VNjBMCX9Z7h+w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 27 Apr 2026 at 12:18, Chengpeng Yan <chengpeng_yan(at)outlook(dot)com> wrote:
>
> While looking at the corr() overflow/underflow discussion [1], I noticed
> that regr_r2() still computes
>
> (Sxy * Sxy) / (Sxx * Syy)
>
> directly. At very small or very large scales, those products can round
> to zero or infinity even when the ratio itself is finite.

Yes, good point.

> corr() already has a stabilized calculation for the same Sxx * Syy
> denominator scale. This patch factors that into a helper and lets
> regr_r2() use it as a fallback when one of its direct products has
> rounded to zero or infinity. Otherwise, regr_r2() keeps the existing
> direct formula.

The comments need work -- in particular float8_regr_r2() needs a
comment explaining the new overflow/underflow checks, similar to the
comment in float8_corr(). In fact, doing that, I think it's preferable
to just keep this change local to float8_regr_r2(), rather than
refactoring into a helper function for just a few lines of code.

This new check in float8_regr_r2():

+ if (Sxy == 0 && !isnan(Sxx) && !isnan(Syy))
+ PG_RETURN_FLOAT8(0.0);

seems pointless. It's optimising for a special case that will very
rarely occur in practice, and which is handled fine by the general
code. We don't want to slow down the common code path for such rare
special cases.

I noticed that this new overflow test case:

+SELECT regr_r2(1e154::float8 * g, 1e154::float8 * g)
+ FROM generate_series(1, 2) g;
+ regr_r2
+---------
+ 1
+(1 row)

only produces 1 because it's run with a reduced extra_float_digits
value. I think it's better to use the test values "1e100 + g * 1e95",
which still trigger the overflow on HEAD, but more reliably produce 1,
regardless of the extra_float_digits setting, making it less likely
that there will be variations between platforms. That's also more
consistent with the other nearby test cases.

Attached is a v2 patch with those changes, plus a little more tidying
up of the regression tests.

Arguably, this is a bug fix, but given the lack of prior complaints, I
think we should apply this to HEAD only, like 6498287696d, meaning it
will only appear in PG19.

Regards,
Dean

Attachment Content-Type Size
v2-0001-Improve-overflow-underflow-handling-in-regr_r2.patch text/x-patch 7.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zsolt Parragi 2026-05-16 11:15:34 Re: [PATCH] Add NESTED_STATEMENTS option to EXPLAIN
Previous Message Etsuro Fujita 2026-05-16 09:10:00 Re: Import Statistics in postgres_fdw before resorting to sampling.