Floating point comparison inconsistencies of the geometric types

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Floating point comparison inconsistencies of the geometric types
Date: 2016-05-27 10:43:02
Message-ID: CAE2gYzwwxPWbzxY3mtN4WL7W0DCkWo8gnB2ThUHU2XQ9XwgHMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

There are those macros defined for the built-in geometric types:

> #define EPSILON 1.0E-06

> #define FPzero(A) (fabs(A) <= EPSILON)
> #define FPeq(A,B) (fabs((A) - (B)) <= EPSILON)
> #define FPne(A,B) (fabs((A) - (B)) > EPSILON)
> #define FPlt(A,B) ((B) - (A) > EPSILON)
> #define FPle(A,B) ((A) - (B) <= EPSILON)
> #define FPgt(A,B) ((A) - (B) > EPSILON)
> #define FPge(A,B) ((B) - (A) <= EPSILON)

with this warning:

> * XXX These routines were not written by a numerical analyst.

Most of the geometric operators use those macros for comparison, but
those do not:

* polygon << polygon
* polygon &< polygon
* polygon &> polygon
* polygon >> polygon
* polygon <<| polygon
* polygon &<| polygon
* polygon |&> polygon
* polygon |>> polygon
* box @> point
* point <@ box
* lseg <@ box
* circle @> point
* point <@ circle

This is really a bug that needs to be fixed one way or another. I think
that it is better to fix it by removing the macros all together. I
am not sure how useful they are in practice. I haven't seen anyone
complaining about the above operators not using the macros. Though,
people often complain about the ones using the macros and the problems
caused by them.

The hackers evidently don't like the macros, either. That should be
why they are not used on the new operators. What annoys me most about
this situation is the inconsistency blocks demonstrating our indexes.
Because of this, we had to rip out point type support from BRIN
inclusion operator class which could be useful to PostGIS.

Fixing it has been discussed many times before [1][2][3][4] with
no result. Here is my plan to fix the situation covering the problems
around it:

1) Reimplement some operators to avoid divisions

The attached patch does it on some of the line operators.

2) Use exact comparison on everywhere except the operators fuzzy
comparison is certainly required

The attach patch changes all operators except some "lseg" operators.
"lseg" stores two points on a line. Most of the operations done on it
are lossy. I don't see a problem treating them differently, those
operators are very unlikely to be index supported.

3) Check numbers for underflow and overflow

I am thinking to use CHECKFLOATVAL on utils/adt/float.c, but it is not
exposed outside at the moment. Would it be okay to create a new header
file utils/float.h to increase code sharing between float and geometric
types?

4) Implement relative fuzzy comparison for the remaining operators

It is better to completely get rid of those macros while we are on it.
I think we can get away by implementing just a single function for fuzzy
equality, not for other comparisons. I am inclined to put it to
utils/adt/float.c. Is this a good idea?

Tom Lane commented on the function posted to the list [1] on 2002:

> Not like that. Perhaps use a fraction of the absolute value of the
> one with larger absolute value. As-is it's hard to tell how FLT_EPSILON
> is measured.

I cannot image how the function would look like. I would appreciate
any guidance.

5) Implement default hash operator class for all geometric types

This would solve most complained problem of those types allowing them
to used with DISTINCT and GROUP BY.

6) Implement default btree operator class at least for the point type

This would let the point type to be used with ORDER BY. It is
relatively straight forward to implement it for the point type. Is it
a good idea to somehow implement it for other types?

7) Add GiST index support for missing cross type operators

Currently only contained by operators are supported by an out-of-range
strategy number. I think we can make the operator classes much nicer
by allowing really cross type operator families.

Comments?

[1] https://www.postgresql.org/message-id/flat/D90A5A6C612A39408103E6ECDD77B8290FD4E3(at)voyager(dot)corporate(dot)connx(dot)com
[2] https://www.postgresql.org/message-id/flat/4A7C2C4B(dot)5020508(at)netspace(dot)net(dot)au
[3] https://www.postgresql.org/message-id/flat/12549(dot)1346111029(at)sss(dot)pgh(dot)pa(dot)us
[4] https://www.postgresql.org/message-id/flat/20150512181307(dot)GJ2523(at)alvh(dot)no-ip(dot)org

Attachment Content-Type Size
0001-geo-ops-fpcomp-v01.patch text/x-diff 48.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-05-27 11:53:38 COMMENT ON, psql and access methods
Previous Message Craig Ringer 2016-05-27 10:02:49 Re: Allow COPY to use parameters