Re: horrendous query challenge :-)

From: Fran Fabrizio <ffabrizio(at)mmrd(dot)com>
To: Shaun Thomas <sthomas(at)townnews(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: horrendous query challenge :-)
Date: 2002-05-30 17:34:12
Message-ID: 3CF66294.9070806@mmrd.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Ok, I've taken Shaun and Tom's advice, and I've also loaded the schema
and data into a 7.2 devel machine in order to produce EXPLAIN ANALYZE
output.

Shaun and Tom pointed out that I could remove the IN, and Shaun also
showed me that including the view in the join was unnecessary, so we
ended up with the final query of:

SELECT wm.entity_id, e.type, e.name, w.interface_label,
wm.last_contact AS remote_ts, s.name, r.name
FROM entity_watch_map wm, entity e, site s,
region r, watch w
WHERE wm.last_contact > "timestamp"(now() - 180)
AND wm.current = false
AND wm.msg_type = w.msg_type
AND wm.entity_id = e.entity_id
AND e.active = true
AND findsite(wm.entity_id) = s.site_id
AND s.region_id = r.region_id
ORDER BY wm.last_contact desc, r.name, s.name;

I created indexes on all the *_id columns as well as the msg_type
columns, but as you can see from the output below, they aren't being
used (these tables are small so that doesn't surprise me).

Performance is about 2-3x's better, but still taking around 40s. Now
that we've eliminated the view and it's function calls, and the IN, this
seems pretty straightforward. My next stop was to check on the one
remaining function call, the findsite() in the where clause. But, I
have spent about a week's time in the past working on that one and it's
fairly well optimized, down to about .003 seconds per call. Small bit
of evidence:

monitor=# select now(); select findsite(10950); select now();
now
-------------------------------
2002-05-30 13:16:50.519125-04
(1 row)

findsite
----------
10132
(1 row)

now
-------------------------------
2002-05-30 13:16:50.522812-04
(1 row)

monitor=#

Calling it with several other values produces the same results. So I'm
not sure that's the bottleneck. Then, I tried to read through the
EXPLAIN ANALYZE output of the new and improved query, and it looks like
the hit is coming on the nested loop. So I thought maybe setting
enable_nestloop = false would show something interesting, so I edit
postgresql.conf, restart, run again, and get the same exact EXPLAIN
ANALYZE output (showing that it's still doing a nested loop anyhow).

At this point, I'm past my expertise. Are there other areas of this
query I can look to improve upon? Here is the EXPLAIN ANALYZE output.
The row counts are accurate to the reality, too (Tom had asked).

monitor=# explain analyze SELECT wm.entity_id, e.type, e.name,
w.interface_label,
monitor-# wm.last_contact AS remote_ts, s.name, r.name
monitor-# FROM entity_watch_map wm, entity e, site s,
monitor-# region r, watch w
monitor-# WHERE wm.last_contact > "timestamp"(now() - 180)
monitor-# AND wm.current = false
monitor-# AND wm.msg_type = w.msg_type
monitor-# AND wm.entity_id = e.entity_id
monitor-# AND e.active = true
monitor-# AND findsite(wm.entity_id) = s.site_id
monitor-# AND s.region_id = r.region_id
monitor-# ORDER BY wm.last_contact desc, r.name, s.name;
NOTICE: QUERY PLAN:

Sort (cost=469.67..469.67 rows=39 width=102) (actual
time=38170.90..38171.06 rows=202 loops=1)
-> Hash Join (cost=72.86..468.65 rows=39 width=102) (actual
time=222.59..38163.46 rows=202 loops=1)
-> Hash Join (cost=71.49..466.08 rows=36 width=84) (actual
time=222.25..38160.23 rows=202 loops=1)
-> Nested Loop (cost=70.40..464.37 rows=36 width=69)
(actual time=222.03..38156.67 rows=202 loops=1)
-> Hash Join (cost=70.40..100.27 rows=47 width=38)
(actual time=33.58..41.06 rows=202 loops=1)
-> Seq Scan on entity e (cost=0.00..19.02
rows=513 width=22) (actual time=0.13..4.52 rows=513 loops=1)
-> Hash (cost=70.20..70.20 rows=80 width=16)
(actual time=32.98..32.98 rows=0 loops=1)
-> Seq Scan on entity_watch_map wm
(cost=0.00..70.20 rows=80 width=16) (actual time=0.15..32.36 rows=240
loops=1)
-> Seq Scan on site s (cost=0.00..5.53 rows=153
width=31) (actual time=0.01..1.02 rows=153 loops=202)
-> Hash (cost=1.07..1.07 rows=7 width=15) (actual
time=0.14..0.14 rows=0 loops=1)
-> Seq Scan on region r (cost=0.00..1.07 rows=7
width=15) (actual time=0.09..0.11 rows=7 loops=1)
-> Hash (cost=1.30..1.30 rows=30 width=18) (actual
time=0.30..0.30 rows=0 loops=1)
-> Seq Scan on watch w (cost=0.00..1.30 rows=30
width=18) (actual time=0.10..0.20 rows=30 loops=1)
Total runtime: 38171.52 msec

EXPLAIN
monitor=#

Thank you for your continued help, I've learned a lot already.

-Fran

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message mpls 2002-05-30 17:44:49 Re: connection refused problem
Previous Message Bill Gribble 2002-05-30 17:30:27 Re: erros when making examples in /src/test/examples