Re: cube operations slower than geo_distance() on production server

From: Mark Stosberg <mark(at)summersault(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: cube operations slower than geo_distance() on production server
Date: 2007-02-12 19:48:58
Message-ID: eqqgc5$kle$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Merlin--

Thanks so much for your help. Some follow-ups are below.

Merlin Moncure wrote:
>
>> Here the basic query I'm using:
>> SELECT
>> -- 1609.344 is a constant for "meters per mile"
>> cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
>> '90210') , earth_coords)/1609.344
>> AS RADIUS
>> FROM pets
>> -- "shelters_active" is a view where "shelter_state = 'active'"
>> JOIN shelters_active as shelters USING (shelter_id)
>> -- The zipcode fields here are varchars
>> JOIN zipcodes ON (
>> shelters.postal_code_for_joining = zipcodes.zipcode )
>> -- search for just 'dogs'
>> WHERE species_id = 1
>> AND pet_state='available'
>> AND earth_box(
>> (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
>> 10*1609.344
>> ) @ earth_coords
>> ORDER BY RADIUS;
>
> your query looks a bit funky. here are the problems I see.
>
> * in your field list, you don't need to re-query the zipcode table.
>> cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
>> '90210') , earth_coords)/1609.344 AS RADIUS
>
> becomes
>
> cube_distance(pets.earth_coords, earth_coords ) / 1609.344 AS RADIUS

It may not have been clear from the query, but only the 'zipcodes' table
has an 'earth_coords' column. Also, I think your refactoring means
something different. My query expresses "number of miles this pet is
from 90210", while I think the refactor expresses a distance between a
pet and another calculated value.

> also, dont. re-refer to the zipcodes table in the join clause. you are
> already joining to it:
>> AND earth_box(
>> (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
>> 10*1609.344) @ earth_coords
>
> becomes
>
> AND earth_box(zipcodes.earth_coords, 10*1609.344) ) @ pets.earth_coords

I have the same question here as above-- I don't see how the new syntax
includes the logic of "distance from the 90210 zipcode".

> * also, does pet_state have any other states than 'available' and '
> not available'? if not, you should be using a boolean. if so, you can
> consider a functional index to convert it to a booelan.

Yes, it has three states.

> * if you always look up pets by species, we can explore composite
> index columns on species, available (especially using the above
> functional suggestion), etc. composite > partial (imo)

We nearly always search by species. Right now it's mostly dogs and some
cats. I searched for references to composite index columns, and didn't
find much. Could you provide a direct reference to what you have in
mind?

Any other ideas appreciated!

Mark

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Brian Cox 2007-02-12 21:36:45 limit + order by is slow if no rows in result set
Previous Message Merlin Moncure 2007-02-12 19:05:18 Re: cube operations slower than geo_distance() on production server