| From: | Chengpeng Yan <chengpeng_yan(at)outlook(dot)com> |
|---|---|
| To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
| Subject: | [PATCH] Fix overflow and underflow in regr_r2() |
| Date: | 2026-04-27 11:18:38 |
| Message-ID: | 33E01656-BB3B-46E9-A41F-24A01A7C35F4@outlook.com |
| Lists: | pgsql-hackers |
Hi,
While looking at the corr() overflow/underflow discussion [1], I noticed
that regr_r2() still computes
    (Sxy * Sxy) / (Sxx * Syy)
directly. At very small or very large scales, those products can round
to zero or infinity even when the ratio itself is finite.
For example,
    SELECT regr_r2(1e-100 + g * 1e-105,
                   1e-100 + g * 1e-105)
    FROM generate_series(1, 3) g;
returns NaN without the patch, although the inputs are perfectly
correlated and the result should be 1.
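To make the failure concrete, here is a minimal C sketch of the direct formula quoted above. It is an illustration only, not code from the patch: the function name r2_direct is hypothetical, and it assumes IEEE 754 double arithmetic without extended intermediate precision. For the three inputs in the example, the deviations from the mean are about -1e-105, 0 and 1e-105, so Sxx, Syy and Sxy are each roughly 2e-210.

```c
#include <math.h>

/*
 * Direct formula as quoted in the message:
 *     (Sxy * Sxy) / (Sxx * Syy)
 * The name r2_direct is hypothetical; this is not PostgreSQL source code.
 */
static double
r2_direct(double Sxx, double Syy, double Sxy)
{
    double numerator = Sxy * Sxy;     /* may round to 0 or infinity */
    double denominator = Sxx * Syy;   /* may round to 0 or infinity */

    return numerator / denominator;
}
```

With sums near 2e-210, both products are near 4e-420, far below the smallest positive double (about 4.9e-324), so both round to zero and the division becomes 0/0. The same shape appears at the other end of the range: sums near 2e210 give infinity/infinity.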
corr() already has a stabilized calculation for the same Sxx * Syy
denominator scale. This patch factors that into a helper and lets
regr_r2() use it as a fallback when one of its direct products has
rounded to zero or infinity. Otherwise, regr_r2() keeps the existing
direct formula.
This preserves regr_r2()'s existing SQL-level special cases. The added
tests cover the fallback path and nearby NaN behavior.
Thoughts?
References:
[1] https://www.postgresql.org/message-id/flat/19340-6fb9f6637f562092%40postgresql.org
--
Best regards,
Chengpeng Yan
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Avoid-overflow-and-underflow-in-regr_r2.patch | application/octet-stream | 7.6 KB |