Re: gist indexes for distance calculations

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Marcelo Zabani <mzabani(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: gist indexes for distance calculations
Date: 2010-10-01 16:04:19
Message-ID: AANLkTinYKnYgT+OVdtssfFs4qjJjBTEYfrMHp7XaO77s@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Sep 30, 2010 at 2:33 PM, Marcelo Zabani <mzabani(at)gmail(dot)com> wrote:
> Hi everyone. I have a question, and it's well beyond me to even speculate
> about the inner workings of postgresql on this.
>
> I have a "places" table, and a "coordinates" column, of type POINT.
>
> If I want to find every place within, for example, a distance of 1 unit from
> an arbitrary point, I'll do:
>
> CREATE INDEX ON places USING gist (circle(coordinates, 1));
>
> And then I'll fetch the desired rows like this:
>
> SELECT * FROM places WHERE circle(coordinates, 1) @> circle('(a,b)', 0);
> (where (a,b) is an arbitrary point)
>
> I'd like to know how this index works, though, as it seems to me the only
> way to have this kind of index to work is to calculate the distance of every
> point in a square of sides 2*1=2 units centered on (a, b).
>
> So, am I wrong to think it works like that? If it does work like that, could
> I have instead two columns of type FLOAT (xcoordinate and ycoordinate) and
> create traditional b-tree indexes on both of these, and then do something
> like:
>
> SELECT * FROM places WHERE xcoordinate >= (a-1) AND xcoordinate <= (a+1) AND
> ycoordinate >= (b-1) AND ycoordinate <= (b+1) And
> SQRT(POW(a-xcoordinate,2)+POW(b-ycoordinate,2))<=1;
>
> If you can also pinpoint me to where I can find this sort of information
> (index utilization and planning, performance tuning), I'd be very grateful.

A quick heads up: It's possible, although it may not necessarily help,
to further reduce distance calcs by drawing an inner bounding box of
points that are confirmed good. Your outer box is made by squaring
the circle on lat/lon projection -- you can also calculate the biggest
lat lon 'rectangle' that completely fits inside the circle, and play
with a query that looks something like this (pseudo sql):

select * from points where (point inside good box) or (point inside
possible box and dist(point, mypoint < n));

You get reduction of dist calcs at expense of second gist lookup. You
can also, of course, do this on application side, but what's the fun
in that? :-).

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Samuel Gendler 2010-10-01 16:50:39 Re: How does PG know if data is in memory?
Previous Message Tom Lane 2010-10-01 14:40:11 Re: How does PG know if data is in memory?