| From: | Madeleine Thompson <madeleineth(at)gmail(dot)com> | 
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> | 
| Subject: | Re: BUG #15307: Low numerical precision of (Co-) Variance | 
| Date: | 2018-09-27 05:12:23 | 
| Message-ID: | 153802514313.25845.6696084477831363882.pgcf@coridan.postgresql.org | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-bugs pgsql-hackers | 
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested
This is my first PostgreSQL review. Hopefully I'm getting it right. I independently ran into the bug in question and found that the author had already written a patch. I'm happy the author wrote this.
(1) I am not experienced with PostgreSQL internals, so I can't comment on the coding style or usage of internal functions.
                                                                                 
(2) The patch applies cleanly to ce4887bd025b95c7b455fefd817a418844c6aad3. "make check", "make check-world", and "make install" pass.
(3) The patch addresses the numeric inaccuracy reported in the bug. (Yay!) I believe the tests are sufficient to confirm that it works as intended. I don't think the test coverage *before* this patch was sufficient, and I appreciate the improved test coverage of the combine functions. I verified the new tests manually.
(4) The comments cite Youngs & Cramer (1971). This is a dated citation. It justifies its algorithms based on pre-IEEE754 single-precision arithmetic. Most notably, it assumes that floating-point division is not exact. That said, the algorithm is still a good choice; it is justified now because compared to the standard Welford variance, it is less prone to causing pipeline stalls on a modern CPU. Maybe link to Schubert & Gentz 2018 (https://dl.acm.org/citation.cfm?id=3223036) instead. The new float8_combine combine code is (almost) S&G eqn. 22; the float8_accum code is S&G eqn. 28. float8_regr_accum and float8_regr_combine are S&G eqn. 22.
(5) I think the comment /* Watch out for roundoff error producing a negative variance */ and associated check are obsolete now.
The new status of this patch is: Waiting on Author
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Madeleine Thompson | 2018-09-27 05:46:25 | Re: BUG #15307: Low numerical precision of (Co-) Variance | 
| Previous Message | Michael Paquier | 2018-09-27 04:44:30 | Re: BUG #15402: Hot standby server with archive_mode=on keeps initial WAL segments | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Madeleine Thompson | 2018-09-27 05:46:25 | Re: BUG #15307: Low numerical precision of (Co-) Variance | 
| Previous Message | Michael Paquier | 2018-09-27 05:10:59 | Re: Let's stop with the retail rebuilds of src/port/ files already |