From: Dan Harris <fbsd(at)drivefaster(dot)net>
To: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Help optimizing a slow index scan
Date: 2006-03-16 21:40:32
Message-ID: 4419DB50.10404@drivefaster.net
Lists: pgsql-performance

explain analyze
select distinct eventmain.incidentid, eventmain.entrydate,
eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy
from eventmain, eventgeo
where
eventmain.incidentid = eventgeo.incidentid and
( long > -104.998027962962 and long < -104.985957781349 ) and
( lat > 39.7075542720006 and lat < 39.7186195832938 ) and
eventmain.entrydate > '2006-1-1 00:00' and
eventmain.entrydate <= '2006-3-17 00:00'
order by
eventmain.entrydate;

                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=121313.81..121330.72 rows=451 width=178) (actual time=723719.761..723726.875 rows=1408 loops=1)
   ->  Sort  (cost=121313.81..121314.94 rows=451 width=178) (actual time=723719.755..723721.807 rows=1408 loops=1)
         Sort Key: eventmain.entrydate, eventmain.disposition, eventmain.incidentid, eventgeo.reportingarea,
             eventgeo.beatid, eventmain.finaltype, eventmain.casenumber, eventgeo.eventlocation,
             eventmain.insertdate, eventmain.priority, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy
         ->  Nested Loop  (cost=0.00..121293.93 rows=451 width=178) (actual time=1916.230..723712.900 rows=1408 loops=1)
               ->  Index Scan using eventgeo_lat_idx on eventgeo  (cost=0.00..85488.05 rows=10149 width=76)
                     (actual time=0.402..393376.129 rows=22937 loops=1)
                     Index Cond: ((lat > 39.7075542720006::double precision) AND (lat < 39.7186195832938::double precision))
                     Filter: ((long > -104.998027962962::double precision) AND (long < -104.985957781349::double precision))
               ->  Index Scan using eventmain_incidentid_idx on eventmain  (cost=0.00..3.52 rows=1 width=119)
                     (actual time=14.384..14.392 rows=0 loops=22937)
                     Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text)
                     Filter: ((entrydate > '2006-01-01 00:00:00'::timestamp without time zone) AND
                              (entrydate <= '2006-03-17 00:00:00'::timestamp without time zone))

Total runtime: >>> 723729.238 ms(!) <<<

I'm trying to figure out why it's consuming so much time on the index
scan for eventgeo_lat_idx. Also, I have an index on "long" that the
planner does not appear to find helpful.
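
One thing I'm considering trying, since on 8.0 the planner can only drive the scan from a single btree index (the bitmap scans that can AND two single-column indexes together didn't arrive until 8.1), so the "long" condition is left as a filter applied after each heap fetch. A two-column index might let both conditions be checked in the index before touching the heap. This is just a sketch of what I have in mind; the index name is my own placeholder:

```sql
-- hypothetical index name; lat/long as in the query above.
-- The lat range still determines how much of the index is walked,
-- but the long condition can be checked on index tuples before
-- the (expensive, random) heap fetch.
create index eventgeo_lat_long_idx on eventgeo (lat, long);
analyze eventgeo;
```
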

There are 3.3 million records in eventmain and eventgeo. The server has
a reasonably fast RAID10 setup with 16x 15k RPM drives and 12GB of RAM
(11GB listed as "cache" by vmstat). Running PostgreSQL 8.0.2 on Linux
kernel 2.6.12.

I have just vacuum analyze'd both tables, rebuilt the eventgeo_lat_idx
index, and reran the query multiple times to see if caching helped (it
didn't help much). The server seems fine when queries use other fields
from this table, but "long" and "lat" seem to drag it down
significantly.

Is it because there are only slight differences between the records,
since they are all within a few hundredths of a degree of each other?
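
I also notice the estimate for the eventgeo scan is off by more than 2x (10149 estimated vs. 22937 actual rows), so I may try raising the statistics target on those columns before re-analyzing, in case better stats nudge the planner toward a different plan. A sketch of what I'd run (the target value 500 is just a guess on my part):

```sql
-- raise per-column stats targets, then re-gather stats
alter table eventgeo alter column lat set statistics 500;
alter table eventgeo alter column long set statistics 500;
analyze eventgeo;
```
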

Thanks for your time and ideas.

-Dan
