performance hit when joining with a view?

From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: performance hit when joining with a view?
Date: 2003-09-25 12:15:16
Message-ID: 24470000.1064492116@rambutan.pingpong.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi!

A performance question:

I have some tables:

Tabell "public.person"
Kolumn | Typ | Modifierare
------------------+--------------------------+---------------
userid | text | not null
giver | text |
first_name | text |
last_name | text |
email | text |
default_language | text | default 'sv'
created | timestamp with time zone | default now()
created_by | text |
Index: person_pkey primärnyckel btree (userid),
person_index unik btree (userid),
person_giver_idx btree (giver)
Främmande nyckel-villkor: pp_fk9 FOREIGN KEY (giver) REFERENCES
providers(giver) ON UPDATE CASCADE ON DELETE CASCADE,
pp_fk2 FOREIGN KEY (created_by) REFERENCES
person(userid) ON UPDATE CASCADE ON DELETE SET NULL

Tabell "public.wiol"
Kolumn | Typ | Modifierare
-----------------+-----------------------------+---------------
userid | text | not null
course_id | integer |
login_ts | timestamp without time zone | default now()
latest_event_ts | timestamp without time zone | default now()
Främmande nyckel-villkor: pp_fk2 FOREIGN KEY (course_id) REFERENCES
course(id) ON UPDATE CASCADE ON DELETE CASCADE,
pp_fk1 FOREIGN KEY (userid) REFERENCES
person(userid) ON UPDATE CASCADE ON DELETE CASCADE

and a view:

Vy "public.person_wiol_view"
Kolumn | Typ | Modifierare
------------------+--------------------------+-------------
userid | text |
giver | text |
first_name | text |
last_name | text |
email | text |
default_language | text |
created | timestamp with time zone |
created_by | text |
course_id | integer |
Vydefinition: SELECT p.userid, p.giver, p.first_name, p.last_name, p.email,
p.default_language, p.created, p.created_by, w.course_id FROM (person p
LEFT JOIN wiol w ON ((p.userid = w.userid)));

Now, with about 30000 tuples in person and about 40 in wiol, executing a
left outer join with the view gives horrible performance:

explain analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts,
p.type, case when sender.userid is not null then sender.first_name || ' '
|| sender.last_name else null end as sender_name, sender.course_id is not
null as is_online from pim p left outer join person_wiol_view sender on
(sender.userid = p.sender) where p.recipient = 'axto6551' and p.type >= 0
limit 1;
QUERY PLAN

---------------------------------------------------------------------------
----------------------------------------------------------
Limit (cost=0.00..1331.26 rows=1 width=180) (actual time=866.14..1135.65
rows=1 loops=1)
-> Nested Loop (cost=0.00..1331.26 rows=1 width=180) (actual
time=866.13..1135.63 rows=2 loops=1)
Join Filter: ("inner".userid = "outer".sender)
-> Seq Scan on pim p (cost=0.00..0.00 rows=1 width=112) (actual
time=0.05..0.18 rows=2 loops=1)
Filter: ((recipient = 'axto6551'::text) AND ("type" >= 0))
-> Materialize (cost=956.15..956.15 rows=30009 width=68) (actual
time=369.33..437.86 rows=22045 loops=2)
-> Hash Join (cost=0.00..956.15 rows=30009 width=68)
(actual time=0.45..605.21 rows=30013 loops=1)
Hash Cond: ("outer".userid = "inner".userid)
-> Seq Scan on person p (cost=0.00..806.09
rows=30009 width=32) (actual time=0.16..279.28 rows=30009 loops=1)
-> Hash (cost=0.00..0.00 rows=1 width=36) (actual
time=0.13..0.13 rows=0 loops=1)
-> Seq Scan on wiol w (cost=0.00..0.00 rows=1
width=36) (actual time=0.02..0.09 rows=8 loops=1)
Total runtime: 1143.93 msec
(12 rader)

but rewriting the question with an explicit join uses the indices, and runs
*much* faster:

explain analyze select p.pim_id, p.recipient, p.sender, p.message, p.ts,
p.type, case when sender.userid is not null then sender.first_name || ' '
|| sender.last_name else null end as sender_name, w.course_id is not null
as is_online from pim p left outer join person sender on (sender.userid =
p.sender) left join wiol w on (w.userid=sender.userid) where p.recipient =
'axto6551' and p.type >= 0 limit 1;
QUERY PLAN

---------------------------------------------------------------------------
-----------------------------------------------------------------
Limit (cost=0.00..6.03 rows=1 width=180) (actual time=0.89..1.13 rows=1
loops=1)
-> Hash Join (cost=0.00..6.03 rows=1 width=180) (actual
time=0.88..1.12 rows=2 loops=1)
Hash Cond: ("outer".userid = "inner".userid)
-> Nested Loop (cost=0.00..6.02 rows=1 width=144) (actual
time=0.48..0.69 rows=2 loops=1)
-> Seq Scan on pim p (cost=0.00..0.00 rows=1 width=112)
(actual time=0.04..0.16 rows=2 loops=1)
Filter: ((recipient = 'axto6551'::text) AND ("type" >=
0))
-> Index Scan using person_pkey on person sender
(cost=0.00..6.01 rows=1 width=32) (actual time=0.23..0.24 rows=1 loops=2)
Index Cond: (sender.userid = "outer".sender)
-> Hash (cost=0.00..0.00 rows=1 width=36) (actual
time=0.22..0.22 rows=0 loops=1)
-> Seq Scan on wiol w (cost=0.00..0.00 rows=1 width=36)
(actual time=0.12..0.17 rows=8 loops=1)
Total runtime: 1.39 msec
(11 rader)

Tests run on postgresql-7.3.4.

Main question is, is it bad SQL to join with a view, or is it postgresql
that does something not quite optimal? If the latter, is it fixed in 7.4?

Thanks,
Palle

Browse pgsql-general by date

  From Date Subject
Next Message Jan Wieck 2003-09-25 13:36:41 Re: About GPL and proprietary software
Previous Message Peter Eisentraut 2003-09-25 11:35:40 German translation of PostgreSQL documentation