| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Bill Kim <billkimjh(at)gmail(dot)com> |
| Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19524: NaN handling in btree_gist's float4/float8 opclasses |
| Date: | 2026-06-29 23:08:42 |
| Message-ID: | 2231547.1782774522@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
I studied this and decided that it only partially fixes the problem.
Yes, we've got to fix the comparison functions, but we also have to
fix the penalty and distance functions. As things stand, the penalty
functions will probably return NaN for any input including a NaN,
which gistpenalty will clamp to zero, which we do not want because
it'll basically break all decisions about where to put things,
leading to a seriously inefficient index. (Maybe gistpenalty should
do something else with a NaN, but I'm hesitant to touch that right
now.) The penalty logic is also under the misapprehension that
multiplying everything by 0.49 keeps it from having to worry about
overflows; that won't fix things for infinities. The distance
functions will do the wrong things for NaN as well, probably breaking
any KNN search that happens across a NaN index entry.
So I fixed all that, and then was dismayed to find that the new
penalty logic wasn't reached at all in the regression tests, per code
coverage testing. We don't compute any penalties during GiST index
build unless it's a multicolumn index, so most of the per-datatype
tests aren't reaching that. It didn't seem like adding more index
entries partway through the test would fit very well into the
structure of float[48].sql, so instead I added simple tests of
two-column float indexes to improve the coverage. Also, I thought
we should test this logic by adding NaN (and infinities for good
measure) to the initial data load, rather than creating a new small
table which would only exercise the logic very minimally.
At the moment I'm leaning to back-patching this. We'd have to
release-note the need to reindex any btree_gist indexes containing
NaNs, but we've done similar things many times before.
regards, tom lane
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-btree_gist-fix-NaN-handling-in-float4-float8-opcl.patch | text/x-diff | 26.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2026-06-29 23:18:08 | Re: uuidv7 improperly accepts dates before 1970-01-01 |
| Previous Message | Bill Kim | 2026-06-29 15:50:11 | Re: BUG #19524: NaN handling in btree_gist's float4/float8 opclasses |