Skip site navigation (1) Skip section navigation (2)

Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m

From: Craig James <cjames(at)emolecules(dot)com>
To: Stefan Keller <sfkeller(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Date: 2012-08-07 14:37:42
Message-ID: CAFwQ8rd0eMcX45HmMTGqwbW==nqMovN=cs2MZYbu7SGGqppBHA@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Tue, Aug 7, 2012 at 5:01 AM, Stefan Keller <sfkeller(at)gmail(dot)com> wrote:

> Hi
>
> I have an interesting query to be optimized related to this one [1].
>
> The query definition is: Select all buildings that have more than 1
> pharmacies and more than 1 schools within a radius of 1000m.
>
> The problem is that I think that this query is inherently O(n^2). In
> fact the solution I propose below takes forever...
>

Maybe you could get rid of the O(n^2) aspect like this:

   Select all buildings that have more than 1
   pharmacies and more than 1 schools within a radius of 1000m
   from
      (Select all buildings that have more than four (pharmacy or school)
        within a radius of 1000m)

The inner select should be fast -- you could make it fast by creating a new
property like "building of interest" that was "pharmacy or school" and
build an index on the "building of interest" property.

The inner query would reduce your sample set to a much smaller set of
buildings, and presumably the outer query could handle that pretty quickly.

Craig James


>
> My questions:
>
> 1. Any comments about the nature of this problem?
>
> 2. ... on how to speed it up ?
>
> 3. In the original query [1] there's a count which contains a
> subquery. According to my tests PostgreSQL does not allow this despite
> the documentation which says "count(expression)".
>
> Remarks: I know that "count(*)" could be faster on PostgreSQL but
> "count(osm_id)" does not change the query plan and this does not seem
> to be the bottleneck here anyway.
>
> Yours, S.
>
> [1]
> http://gis.stackexchange.com/questions/11445/selecting-pois-around-specific-buildings-using-postgis
>
>
> Here's my query:
>
> -- Select all buildings that have >1 pharmacies and >1 schools within
> 1000m:
> SELECT osm_id AS building_id
> FROM
>   (SELECT osm_id, way
>    FROM osm_polygon
>    WHERE tags @> hstore('building','yes')
>   ) AS b
> WHERE
>  (SELECT count(*) > 1
>   FROM osm_poi AS p
>   WHERE p.tags @> hstore('amenity','pharmacy')
>   AND ST_DWithin(b.way,p.way,1000)
>  )
>  AND
>  (SELECT count(*) > 1
>   FROM osm_poi AS p
>   WHERE p.tags @> hstore('amenity','school')
>   AND ST_DWithin(b.way,p.way,1000)
>  )
> -- Total query runtime: 4308488 ms. 66345 rows retrieved.
>
> Here's the query plan (from EXPLAIN):
> "Index Scan using osm_polygon_tags_idx on osm_polygon
> (cost=0.00..406812.81 rows=188 width=901)"
> "  Index Cond: (tags @> '"building"=>"yes"'::hstore)"
> "  Filter: ((SubPlan 1) AND (SubPlan 2))"
> "  SubPlan 1"
> "    ->  Aggregate  (cost=269.19..269.20 rows=1 width=0)"
> "          ->  Bitmap Heap Scan on osm_poi p  (cost=7.76..269.19
> rows=1 width=0)"
> "                Recheck Cond: (way && st_expand(osm_polygon.way,
> 1000::double precision))"
> "                Filter: ((tags @> '"amenity"=>"pharmacy"'::hstore)
> AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND
> _st_dwithin(osm_polygon.way, way, 1000::double precision))"
> "                ->  Bitmap Index Scan on osm_poi_way_idx
> (cost=0.00..7.76 rows=62 width=0)"
> "                      Index Cond: (way && st_expand(osm_polygon.way,
> 1000::double precision))"
> "  SubPlan 2"
> "    ->  Aggregate  (cost=269.19..269.20 rows=1 width=0)"
> "          ->  Bitmap Heap Scan on osm_poi p  (cost=7.76..269.19
> rows=1 width=0)"
> "                Recheck Cond: (way && st_expand(osm_polygon.way,
> 1000::double precision))"
> "                Filter: ((tags @> '"amenity"=>"school"'::hstore) AND
> (osm_polygon.way && st_expand(way, 1000::double precision)) AND
> _st_dwithin(osm_polygon.way, way, 1000::double precision))"
> "                ->  Bitmap Index Scan on osm_poi_way_idx
> (cost=0.00..7.76 rows=62 width=0)"
> "                      Index Cond: (way && st_expand(osm_polygon.way,
> 1000::double precision))"
>
> ***
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Responses

pgsql-performance by date

Next:From: Jeff JanesDate: 2012-08-07 16:00:55
Subject: Re: Sequential scan instead of index scan
Previous:From: Tomas VondraDate: 2012-08-07 13:56:46
Subject: Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group