Re: Strange behavior with polygon and NaN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: gkokolatos(at)pm(dot)me, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Strange behavior with polygon and NaN
Date: 2020-11-10 19:30:08
Message-ID: 2200966.1605036608@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I spent some time looking this over, and have a few thoughts:

1. I think it's useful to split the test changes into two patches,
as I've done below: first, just add the additional row in point_tbl
and let the fallout from that happen, and then in the second patch
make the code changes. This way, it's much clearer what the actual
behavioral changes are. Some of them don't look right, either.
For instance, in the very first hunk in geometry.out, we have
this:

- (Infinity,1e+300) | {1,0,5} | NaN | NaN
+ (Infinity,1e+300) | {1,0,5} | Infinity | Infinity

which seems right, and also this:

- (1e+300,Infinity) | {1,-1,0} | Infinity | Infinity
- (1e+300,Infinity) | {-0.4,-1,-6} | Infinity | Infinity
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} | Infinity | Infinity
+ (1e+300,Infinity) | {1,-1,0} | NaN | NaN
+ (1e+300,Infinity) | {-0.4,-1,-6} | NaN | NaN
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} | NaN | NaN

which does not. Why aren't these distances infinite as well?
For instance, {1,-1,0} is the line "x = y". We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

2. Rather than coding around undesirable behavior of float8_min,
it seems like it's better to add a primitive to float.h that
does what you want, ie "NaN if either input is NaN, else the
smaller input". This is more readable, and possibly more efficient
(depending on whether the compiler is smart enough to optimize
away redundant isnan checks). I did that in the attached.

3. Looking for other calls of float8_min, I wonder why you did not
touch the bounding-box calculations in box_interpt_lseg() or
boxes_bound_box().

4. The line changes feel a bit weird, like there's no clear idea
of what a "valid" or "invalid" line is. For instance the first
hunk in line_construct():

+ /* Avoid creating a valid line from an invalid point */
+ if (unlikely(isnan(pt->y)))
+ result->C = get_float8_nan();

Why's it appropriate to set C and only C to NaN?

5. But actually there's a bigger issue with that particular hunk.
This code branch is dealing with "draw a vertical line through this
point", so why should we care what the point's y coordinate is --- that
is, why is this particular change appropriate at all? The usual rule as
I understand it is that if a function's result is determined by some of
its arguments independently of what another argument's value is, then it
doesn't matter if that one is NaN, you can still return the same result.

6. I'm a bit uncomfortable with the use of "bool isnan" in a couple
of places. I think it's confusing to use that alongside the isnan()
macro. Moreover, it's at least possible that some platforms implement
isnan() in a way that would break this usage. The C spec specifically
says that isnan() is a macro not a function ... but it doesn't commit
to it being a macro-with-arguments. I think "anynan" or something
like that would be a better choice of name.

[ a bit later... ] Indeed, I get a compile failure on gaur:

geo_ops.c: In function 'lseg_closept_lseg':
geo_ops.c:2906:17: error: called object 'isnan' is not a function
geo_ops.c:2906:32: error: called object 'isnan' is not a function
geo_ops.c:2916:16: error: called object 'isnan' is not a function
geo_ops.c:2924:16: error: called object 'isnan' is not a function
geo_ops.c: In function 'box_closept_point':
geo_ops.c:2989:16: error: called object 'isnan' is not a function
geo_ops.c:2992:16: error: called object 'isnan' is not a function
geo_ops.c:3004:16: error: called object 'isnan' is not a function
geo_ops.c:3014:16: error: called object 'isnan' is not a function
make: *** [geo_ops.o] Error 1

So that scenario isn't hypothetical. Please rename the variables.

regards, tom lane

Attachment Content-Type Size
v4-0001-add-more-point-tests.patch text/x-diff 66.3 KB
v4-0002-fix-geometric-nan-handling.patch text/x-diff 49.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2020-11-10 19:38:16 Re: [PATCH] Support negative indexes in split_part
Previous Message Tomas Vondra 2020-11-10 18:52:48 Re: Windows regress fails (latest HEAD)