Re: query speed question

From: Christopher Condit <condit(at)sdsc(dot)edu>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: query speed question
Date: 2009-09-03 23:21:48
Message-ID: BAFDED91EF7D5A43A30F897D6C94B1284D22EA56DE@MBX.ad.sdsc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> > I have two tables that are georeferenced (although in this case I'm
> not using PostGIS) that I need to join.
> > A ( lat | lon | depth | value)
> > |A| = 1,100,000
> >
> > B ( lat | lon | attributes)
> > |B| = 14,000,000
> >
> > A is a special case because the lat / lon values are all at half
> degree intervals (ie 33.5, -12.5). The lat/lons in B are arbitrary.
> > I've written a function in B called getSpecialLat(latitude) and
> getSpecialLon(longitude) to calculate the correct A latitude and built
> an index on both functions.
> >
> > Here's the query that I'm trying, but it's rather slow:
> > SELECT B.* FROM B,
> > (SELECT lat, lon FROM A WHERE value > 0 AND value < 2 AND depth = 0)
> AS foo
> > WHERE getSpecialLat(B.lat) = foo.lat AND getSpecialLon(B.lon) =
> foo.lon
> >
> > "Nested Loop (cost=3569.88..32055.02 rows=1414 width=422)"
> > " -> Index Scan using A_valueidx on A (cost=0.00..555.26 rows=6
> width=16)"
> > " Index Cond: ((value > 0) AND (value < 2))"
> > " Filter: (depth = 0)"
> > " -> Bitmap Heap Scan on B (cost=3569.88..5029.48 rows=424
> width=422)"
> > " Recheck Cond: ((getSpecialLon((B.lon)::numeric) = A.lon) AND
> (getSpecialLat((B.lat)::numeric) = A.lat))"
> > " -> BitmapAnd (cost=3569.88..3569.88 rows=424 width=0)"
> > " -> Bitmap Index Scan on Blonidx (cost=0.00..1760.38
> rows=84859 width=0)"
> > " Index Cond: (getSpecialLon((B.lon)::numeric) =
> A.lon)"
> > " -> Bitmap Index Scan on Blatidx (cost=0.00..1766.81
> rows=84859 width=0)"
> > " Index Cond:
> (getSpeicalLat((B.latitude)::numeric) = A.lat)"
> >
> > Am I missing something in terms of speeding up this query?
>
> I'd be interested to see if the query rewritten as a JOIN would be
> faster.

I can write it like this:
select b.*
from b join a on (getwoalatitude(b.latitude::numeric) = a.lat
and getwoalongitude(b.longitude::numeric) = a.lon)
where
a.value > 0 and a.value < 2 and a.depth = 0

which results in this plan:
"Nested Loop (cost=1387.20..13152982.35 rows=1625767 width=422)"
" -> Index Scan using a_depthidx on a_(cost=0.00..1464.07 rows=6897 width=16)"
" Index Cond: (depth = 0)"
" Filter: ((value > 0::numeric) AND (value < 2::numeric))"
" -> Bitmap Heap Scan on b (cost=1387.20..1686.37 rows=424 width=422)"
" Recheck Cond: ((getSpecialLon((b.lon)::numeric) = a.lon) AND (getSpecialLat((b.lat)::numeric) = a.lat))"
" -> BitmapAnd (cost=1387.20..1387.20 rows=424 width=0)"
" -> Bitmap Index Scan on Blonidx (cost=0.00..672.15 rows=84859 width=0)"
" Index Cond: (getSpecialLon((b.lon)::numeric) = a.lon)"
" -> Bitmap Index Scan on Blatidx (cost=0.00..672.36 rows=84859 width=0)"
" Index Cond: (getSpecialLat((b.lat)::numeric) = a.lat)"

However it's still taking ages to execute (over five minutes - I stopped it before it finished)

-Chris

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sergey Samokhin 2009-09-04 00:10:33 How do I store tables on a remote host?
Previous Message David Fetter 2009-09-03 23:00:21 Re: Regex substring help