Re: Strange behavior with polygon and NaN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: gkokolatos(at)pm(dot)me, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Strange behavior with polygon and NaN
Date: 2020-11-20 20:57:46
Message-ID: 1456736.1605905866@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I spent some more time looking at this patch.

Some experimentation shows that the changes around bounding box
calculation (ie float8_min_nan() and its call sites) seem to be
completely pointless: removing them doesn't change any of the regression
results. Nor does using float8_min_nan() in the other two bounding-box
calculations I'd asked about. So I think we should just drop that set
of changes and stick with the rule that bounding box upper and lower
values are sorted as per float.h comparison rules. This isn't that hard
to work with: if you want to know whether any NaNs are in the box, test
the upper limit (and *not* the lower limit) for isnan(). Moreover, even
if we wanted a different coding rule, we really can't have it because we
will still need to work with existing on-disk values that have bounding
boxes computed the old way.

I don't much like anything about float8_coef_mul(). In the first place,
FPzero() is an ugly, badly designed condition that we should be trying
to get rid of not add more dependencies on. In the second place, it's
really unclear why allowing 0 times Inf to be something other than NaN
is a good idea, and it's even less clear why allowing small-but-not-zero
times Inf to be zero rather than Inf is a good idea. In the third
place, the asymmetry between the two inputs looks more like a bug than
something we should actually want.

After some time spent staring at the specific case of line_closept_point
and its callers, I decided that the real problems there are twofold.
First, the API, or at least the callers' interpretation of this
undocumented point, is that whether the distance is undefined (NaN) is
equivalent to whether the closest point is undefined. This is not so;
in some cases we know that the distance is infinite even though we can't
calculate a unique closest point. Second, it's not making any attempt
to eliminate undefined cases up front. We can do that pretty easily
by plugging the point's coordinates into the line's equation Ax+By+C
and seeing whether we get a NaN. The attached 0002 is a subset patch
that just fixes these two issues, and I like the results it produces.

I wonder now whether the problems around line_interpt_line() and the
other intersection-ish functions wouldn't be better handled in a similar
way, by making changes to their API specs to be clearer about what
happens with NaNs and trying to eliminate ill-defined cases explicitly.
I've not tried to code that though.

Changing pg_hypot() the way you've done here is right out. See the
comment for the function: what it is doing now is per all the relevant
standards, and your patch breaks that. It's extremely unlikely that
doing it differently from IEEE and POSIX is a good idea.

Attached are the same old 0001 (adding the extra point_tbl entry)
and a small 0002 that fixes just line_closept_point. I've not
tried to figure out just which of the rest of your diffs should be
dropped given that. I did note though that the example you add
to func.sgml doesn't apply to this version of line_closept_point:

regression=# select point '(Infinity,Infinity)' <-> line '{-1,0,5}';
?column?
----------
NaN
(1 row)

regression=# select point '(Infinity,Infinity)' <-> line '{0,-1,5}';
?column?
----------
NaN
(1 row)

regards, tom lane

Attachment Content-Type Size
v6-0001-add-more-point-tests.patch text/x-diff 66.3 KB
v6-0002-fix-line_closept_point.patch text/x-diff 3.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2020-11-20 21:09:34 Re: Refactor pg_rewind code and make it work against a standby
Previous Message Robert Haas 2020-11-20 20:04:03 Re: xid wraparound danger due to INDEX_CLEANUP false