Re: optimizing a geo_distance() proximity query (example and benchmark)

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Mark Stosberg" <mark(at)summersault(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: optimizing a geo_distance() proximity query (example and benchmark)
Date: 2007-02-06 12:58:29
Message-ID: b42b73150702060458v628126bei9ce1eefcafe3aa72@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2/6/07, Mark Stosberg <mark(at)summersault(dot)com> wrote:
> It's also notable that the units used are meters, not miles like
> geo_distance(). That's what the magic number of "16093.44" is-- 10 miles
> converted to meters.
>
> When I benchmarked this query against the old geo_distance() variation,
> it was about 200 times faster (~100ms vs .5ms).
>
> However, my next step was to try a more "real world" query that involved
> a more complex where clause and a couple of table joins. So far, that
> result is coming out /slower/ with the new approach, even though the
> index is being used. I believe this may be cause of the additional
> results found that are outside of the sphere, but inside the cube. This
> causes additional rows that need processing in the joined tables.
>
> Could someone post an example of how to further refine this so the
> results more closely match what geo_distance returns() ?

I agree with bruno...the extra time is probably not what you are
thinking...please post explain analyze results, etc. However bruno's
ratio, while correct does not tell the whole story because you have to
recheck distance to every point in the returned set.

There is a small optimization you can make. The query you wrote
automatically excludes points within a certain box. you can also
include points in the set which is the largest box that fits in the
circle:

select * from zipcodes
where
earth_box('(436198.322855334,
4878562.8732218, 4085386.43843934)'::cube,inner_radius) @ earth_coords
or
(
earth_box('(436198.322855334,
4878562.8732218, 4085386.43843934)'::cube,16093.44) @ earth_coords
and
geo_dist...
);

you can also choose to omit the earth_coords column and calculate it
on the fly...there is no real performance hit for this but it does
make the sql a bit ugly.

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Stosberg 2007-02-06 14:39:54 Re: explain analyze output for review (was: optimizing a geo_distance()...)
Previous Message Harald Armin Massa 2007-02-06 10:21:41 Re: Tuning