Skip site navigation (1) Skip section navigation (2)

Re: find close (duplicate) points + create index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Elinor Medezinski <elinor(at)bellatrix(dot)tau(dot)ac(dot)il>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: find close (duplicate) points + create index
Date: 2004-03-10 14:58:22
Message-ID: 6492.1078930702@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-novice
Elinor Medezinski <elinor(at)bellatrix(dot)tau(dot)ac(dot)il> writes:
> And then I found out that in postgres the only operator classes defined for 
> rtree indexes are: bigbox_ops, box_ops and poly_ops. Neither of which works 
> with points, only with type box and polygon. Therefore I also have to create 
> an operator class.

No you don't.  What you want is a functional index built on a box or polygon
surrounding the point.  For instance, given

regression=# create table p1 (point_a point);
CREATE TABLE
regression=# create index p1i on p1 using rtree (box(point_a, point_a));
CREATE INDEX

you could do searches for points enclosed in a specific box like this:

regression=# explain select * from p1 where box(point_a, point_a) && '(0,1),(0,1)'::box;
                           QUERY PLAN
----------------------------------------------------------------
 Index Scan using p1i on p1  (cost=0.00..17.07 rows=5 width=16)
   Index Cond: (box(point_a, point_a) && '(0,1),(0,1)'::box)
(2 rows)

since box-overlap (&&) is one of the rtree-indexable operators.

The most useful way to solve your original problem seems to be

regression=# explain select * from p1 a, p1 b where
regression-# box(a.point_a, a.point_a) && box(circle(b.point_a,sqrt(2)))
regression-# and (a.point_a <-> b.point_a) <= 1;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.01..17220.00 rows=1667 width=32)
   Join Filter: (("inner".point_a <-> "outer".point_a) <= 1::double precision)
   ->  Seq Scan on p1 b  (cost=0.00..20.00 rows=1000 width=16)
   ->  Index Scan using p1i on p1 a  (cost=0.01..17.07 rows=5 width=16)
         Index Cond: (box(a.point_a, a.point_a) && box(circle("outer".point_a, 1.4142135623731::double precision)))
(5 rows)

The indexable condition finds "a" rows that are within the bounding box
of a circle surrounding the "b" row, and then we only need to apply the
exact distance check to those rows.

(If you're wondering about the sqrt(2), there's an oddity in the
built-in circle-to-box function: it divides the circle radius
by sqrt(2).  I think this is a bug and will propose changing it
for 7.5.)

			regards, tom lane

In response to

Responses

pgsql-novice by date

Next:From: Costin MandaDate: 2004-03-10 15:14:15
Subject: I am trying to send a message on the list for the last three hours
Previous:From: MohanDate: 2004-03-10 13:08:40
Subject: Re: local usage was: Re: using pgsql on my comp only without

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group