Help analyzing 7.2.4 EXPLAIN

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Pgsql-Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Help analyzing 7.2.4 EXPLAIN
Date: 2003-04-10 00:15:19
Message-ID: 200304091715.19565.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Folks,

What follows is a 7.2.4 EXPLAIN ANALYZE statement for the shown query. This
query is currently taking 570 msec, an OK amount of time until you realize
that the system only has test data currently, and the tables in full
production will have 100-1000 times as much data.

Becuase it's 7.2.4, it's a little hard to tell exactly which part of the
query is taking up 90% of the processing time. The line which claims to be
taking that time is:
-> Seq Scan on users (cost=0.00..3595.33 rows=12 width=87) (actual
time=13.50..547.59 rows=41 loops=1)

However, since users has only 200 records, I suspect that what's actually
being represented here is the processing time for the PL/pgSQL procedure in
the correlated subselect, if_addendee_conflict().

Questions:
1. Can anyone confirm my analysis in the paragraph above?
2. Can anyone point out any obvious ways to speed up the query below?
3. In the query below, if_attendee_conflict needs to be run once for each
(attorney times events) on the same day. Further, if_attendee_conflict
involves a database lookup in 10-25% of cases. Given that
if_attendee_conflict needs to apply complex criteria to determine whether or
not there is a conflict, can anyone suggest possible ways to cut down on the
number of required loops?

Thanks everyone! Query and analyze follows.

j_test=> explain analyze
SELECT users.user_id, (users.fname || COALESCE(' ' || users.minit, '') || ' '
|| users.lname) as atty_name,
users.lname,
(SELECT if_addendee_conflict(users.user_id, 3272, '2003-04-15 10:00', '1
days'::INTERVAL,
events.event_id, events.event_date, events.duration, event_cats.status, '30
minutes') as cflt
FROM events, event_types, event_cats, event_days
WHERE events.event_id = event_days.event_id
and events.etype_id = event_types.etype_id
AND event_types.ecat_id = event_cats.ecat_id
AND event_days.event_day
BETWEEN '2003-04-15' AND '2003-04-16 10:00'
ORDER BY cflt LIMIT 1) AS conflict
FROM users
WHERE EXISTS (SELECT teams_users.user_id FROM teams_users JOIN teams_tree
ON teams_users.team_id = teams_tree.team_id WHERE teams_tree.treeno
BETWEEN 3 and 4 AND teams_users.user_id = users.user_id)
AND users.status > 0
AND NOT EXISTS (SELECT staff_id FROM event_staff WHERE event_id = 3272
AND staff_id = users.user_id)
ORDER BY conflict, users.lname, atty_name;

NOTICE: QUERY PLAN:

Sort (cost=3595.55..3595.55 rows=12 width=87) (actual time=547.89..547.91
rows=41 loops=1)
-> Seq Scan on users (cost=0.00..3595.33 rows=12 width=87) (actual
time=13.50..547.59 rows=41 loops=1)
SubPlan
-> Limit (cost=54.03..54.03 rows=1 width=46) (actual
time=13.14..13.14 rows=1 loops=41)
-> Sort (cost=54.03..54.03 rows=1 width=46) (actual
time=13.13..13.13 rows=2 loops=41)
-> Hash Join (cost=52.77..54.02 rows=1 width=46)
(actual time=5.09..12.94 rows=95 loops=41)
-> Seq Scan on event_cats (cost=0.00..1.16
rows=16 width=6) (actual time=0.01..0.05 rows=16 loops=41)
-> Hash (cost=52.77..52.77 rows=1 width=40)
(actual time=4.72..4.72 rows=0 loops=41)
-> Hash Join (cost=49.94..52.77 rows=1
width=40) (actual time=4.19..4.59 rows=95 loops=41)
-> Seq Scan on event_types
(cost=0.00..2.54 rows=54 width=8) (actual time=0.01..0.12 rows=54 loops=41)
-> Hash (cost=49.93..49.93 rows=5
width=32) (actual time=4.10..4.10 rows=0 loops=41)
-> Nested Loop
(cost=0.00..49.93 rows=5 width=32) (actual time=0.16..3.95 rows=95 loops=41)
-> Seq Scan on event_days
(cost=0.00..25.00 rows=5 width=4) (actual time=0.12..2.31 rows=95 loops=41)
-> Index Scan using
events_pkey on events (cost=0.00..4.97 rows=1 width=28) (actual
time=0.01..0.01 rows=1 loops=3895)
-> Nested Loop (cost=0.00..19.47 rows=1 width=12) (actual
time=0.04..0.04 rows=0 loops=147)
-> Index Scan using idx_teams_tree_node on teams_tree
(cost=0.00..8.58 rows=2 width=4) (actual time=0.01..0.02 rows=2 loops=147)
-> Index Scan using teams_users_pk on teams_users
(cost=0.00..4.83 rows=1 width=8) (actual time=0.01..0.01 rows=0 loops=252)
-> Index Scan using event_staff_table_pk on event_staff
(cost=0.00..4.95 rows=1 width=4) (actual time=0.01..0.01 rows=0 loops=41)
Total runtime: 548.20 msec

--
-Josh Berkus
Aglio Database Solutions
San Francisco

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-04-10 00:20:37 Caching (was Re: choosing the right platform)
Previous Message Jim C. Nasby 2003-04-09 23:58:43 Re: choosing the right platform