Re: Floating point comparison inconsistencies of the geometric types

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: emre(at)hasegeli(dot)com
Cc: kgrittn(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, andreas(at)proxel(dot)se, teodor(at)sigaev(dot)ru, robertmhaas(at)gmail(dot)com, kgrittn(at)ymail(dot)com, Jim(dot)Nasby(at)bluetreble(dot)com, mail(at)joeconway(dot)com
Subject: Re: Floating point comparison inconsistencies of the geometric types
Date: 2017-01-11 05:51:14
Message-ID: 20170111.145114.196298110.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Fri, 18 Nov 2016 10:58:27 +0100, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote in <CAE2gYzxVxKNS7qU74UdHVZTmfXQjxMbFiXH5+16XFy90SRAbXA(at)mail(dot)gmail(dot)com>
> > To keep such kind of integrity, we should deeply consider about
> > errors.
>
> My point is that I don't think we can keep integrity together with the
> fuzzy behaviour, or at least I don't have the skills to do that. I
> can just leave the more complicated operators like "is a
> point on a line" as it is, and only change the basic ones. Do you
> think a smaller patch like this would be acceptable?

The size of the patch is not a problem. I am sorry that I have not
understood your requirement clearly. So, as a starting point for
the new discussion, let me briefly summarize the current
implementation of geometric comparisons.

- Floating point comparisons for geometric types

Comparisons on geometric types are performed by the FPeq macro
and its siblings defined in geo_decls.h. These are intended to
give some tolerance to the comparisons.

             A
FPeq:   |<=e-|-e=>|    (<= means inclusive, e = epsilon = tolerance)
FPne: ->|  e |  e |<-  (<- means exclusive)
FPlt:        |  e |<-
FPle:   |<=e |
FPgt: ->|  e |
FPge:        | e=>|

These seem reasonable, setting aside the question of how large
the tolerance should be.
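
For reference, the macros summarized above can be sketched in C as
follows. This is written from memory of geo_decls.h, so the EPSILON
value and the exact spelling of each macro are assumptions; please
check them against the source tree before relying on them.

```c
#include <math.h>

/* Assumed tolerance; verify against geo_decls.h in your tree. */
#define EPSILON 1.0E-06

/* Equal / not equal: within (inclusive) or outside (exclusive) A +/- EPSILON. */
#define FPeq(A, B) (fabs((A) - (B)) <= EPSILON)
#define FPne(A, B) (fabs((A) - (B)) > EPSILON)

/* A is "less than" B only if B exceeds A by more than EPSILON. */
#define FPlt(A, B) ((B) - (A) > EPSILON)
/* A is "less than or equal to" B if A exceeds B by at most EPSILON. */
#define FPle(A, B) ((A) - (B) <= EPSILON)

/* Mirror images of the two macros above. */
#define FPgt(A, B) ((A) - (B) > EPSILON)
#define FPge(A, B) ((B) - (A) <= EPSILON)
```

With these definitions FPlt and FPge are exact complements of each
other, as are FPgt and FPle, while FPeq overlaps both FPle and FPge.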

- Consistency between index and non-index scans.

GiST supports geometric types.

=# create table tpnt1(id int, p point);
=# insert into tpnt1 (select i + 200, point(i*1.0e-6 / 100.0, i * 1.0e-6 / 100.0) from generate_series(-200, 200) as i);
=# create index on tpnt1 using gist (p);
=# set enable_seqscan to false;
=# set enable_bitmapscan to true;
=# select count(*) from tpnt1 where p ~= point(0, 0);
201
=# select count(*) from tpnt1 where p << point(0, 0);
100
=# set enable_seqscan to true;
=# set enable_bitmapscan to false;
=# select count(*) from tpnt1 where p ~= point(0, 0);
201
=# select count(*) from tpnt1 where p << point(0, 0);
100

At least for the point type, a (bitmap) index scan is consistent
with a sequential scan. As I remember, the issue raised was
"inconsistency between indexed and non-indexed scans over
geometric types", but the two seem consistent with each other.

You mentioned b-tree, which doesn't have a predefined opclass for
geometric types. So the "inconsistency" may refer to the
difference between geometric values and combinations of plain
(first-class) values. But the two are different things and
evidently use different operators (~= and = for equality), so
this does not seem to be the inconsistency in question. More
specifically, "p ~= p0" and "x = x0 and y = y0" are completely
different.
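
To illustrate the difference, here is a minimal sketch in C. It
assumes that ~= on points applies the tolerance to each coordinate
and that = on plain float8 columns is an exact comparison. Pt,
pt_same_fuzzy and pt_same_exact are hypothetical names used only
for this example, and the EPSILON value is an assumption as well.

```c
#include <math.h>
#include <stdbool.h>

/* Assumed tolerance; verify against geo_decls.h in your tree. */
#define EPSILON 1.0E-06

/* Hypothetical stand-in for the point type. */
typedef struct
{
	double		x;
	double		y;
} Pt;

/* Models "p ~= p0": each coordinate may differ by up to EPSILON. */
static bool
pt_same_fuzzy(Pt a, Pt b)
{
	return fabs(a.x - b.x) <= EPSILON && fabs(a.y - b.y) <= EPSILON;
}

/* Models "x = x0 and y = y0" on plain float8 values: no tolerance. */
static bool
pt_same_exact(Pt a, Pt b)
{
	return a.x == b.x && a.y == b.y;
}
```

Two points that differ by half the tolerance in each coordinate
are the same according to the first function but not the second.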

Could you give me (or anyone else on this list) a more precise
description of the problem and/or of what you want to do with
this patch?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
