Re: Strange behavior with polygon and NaN

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jesse Zhang <sbjesse(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Strange behavior with polygon and NaN
Date: 2020-08-25 23:03:50
Message-ID: 20200825230350.GA20669@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

---------------------------------------------------------------------------

On Wed, Jun 24, 2020 at 03:11:03PM -0700, Jesse Zhang wrote:
> Hi hackers,
>
> While working with Chris Hajas on merging Postgres 12 with Greenplum
> Database we stumbled upon the following strange behavior in the geometry
> type polygon:
>
> ------ >8 --------
>
> CREATE TEMP TABLE foo (p point);
> CREATE INDEX ON foo USING gist(p);
>
> INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');
>
> SELECT $q$
> SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
> $q$ AS qry \gset
>
> BEGIN;
> SAVEPOINT yolo;
> SET LOCAL enable_seqscan TO off;
> :qry;
>
> ROLLBACK TO SAVEPOINT yolo;
> SET LOCAL enable_indexscan TO off;
> SET LOCAL enable_bitmapscan TO off;
> :qry;
>
> ------ 8< --------
>
> If you run the above repro SQL in HEAD (and 12, and likely all older
> versions), you get the following output:
>
> CREATE TABLE
> CREATE INDEX
> INSERT 0 3
> BEGIN
> SAVEPOINT
> SET
> p
> -------
> (0,0)
> (1,1)
> (2 rows)
>
> ROLLBACK
> SET
> SET
> p
> -----------
> (0,0)
> (1,1)
> (NaN,NaN)
> (3 rows)
>
>
> At first glance, you'd think this is the gist AM's bad, but on a second
> thought, something else is strange here. The following query returns
> true:
>
> SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
>
> The above behavior of the "contained in" operator is surprising, and
> it's probably not what the GiST AM is expecting. I took a look at
> point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
> NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
> distnace operator for polygon. It gives the following interesting
> output:
>
> SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
> FROM (
> SELECT circle(point(100 * i, 'NaN'), 50) AS c
> FROM generate_series(-2, 4) i
> ) t(c)
> ORDER BY 2;
>
> c | distance
> -----------------+----------
> <(-200,NaN),50> | 0
> <(-100,NaN),50> | 0
> <(0,NaN),50> | 0
> <(100,NaN),50> | 0
> <(200,NaN),50> | NaN
> <(300,NaN),50> | NaN
> <(400,NaN),50> | NaN
> (7 rows)
>
> Should they all be NaN? Am I alone in thinking the index is right but
> the operators are wrong? Or should we call the indexes wrong here?
>
> Cheers,
> Jesse and Chris
>
>

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-08-25 23:10:06 Re: [PATCH] Fix Uninitialized scalar variable (UNINIT) (src/backend/access/heap/heapam_handler.c)
Previous Message Ranier Vilela 2020-08-25 22:54:33 Re: [PATCH] Fix Uninitialized scalar variable (UNINIT) (src/backend/access/heap/heapam_handler.c)