Postgres ignoring index when using left outer join.

From: Matthew Schumacher <matt(dot)s(at)aptalaska(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Postgres ignoring index when using left outer join.
Date: 2007-11-21 04:14:31
Message-ID: 4743B0A7.2030500@aptalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Anyone know what is up with this? I have two queries here which return
the same results, one uses a left outer join to get some data from a
table which may not match a constraint, and one that uses a union to get
the data from each constraint and put them together. The second one
isn't nearly as elegant but is 100 times faster. Any ideas?

Host is linux 2.6.21.7.
Postgres version is 8.2.4.

schu

explain analyze select m.messageid from message_tab m, recipient_tab r
left outer join alias_tab a on a.alias = r.recipient where
m.messagesendmailid = r.messagesendmailid and ( r.recipient = '<email>'
or a.email = '<email>' ) order by m.messageid;

Sort (cost=251959.33..253060.77 rows=440575 width=4) (actual
time=27388.707..27389.431 rows=1183 loops=1)
Sort Key: m.messageid
-> Hash Join (cost=165940.27..204634.07 rows=440575 width=4)(actual
time=24156.311..27387.128 rows=1183 loops=1)
Hash Cond: ((r.messagesendmailid)::text =
(m.messagesendmailid)::text)
-> Hash Left Join (cost=1.04..21379.06 rows=440575 width=18)
(actual time=25.755..2985.690 rows=1680 loops=1)
Hash Cond: ((r.recipient)::text = (a.alias)::text)
Filter: (((r.recipient)::text = '<email>'::text) OR
((a.email)::text = '<email>'::text))
-> Seq Scan on recipient_tab r (cost=0.00..18022.93
rows=879493 width=43) (actual time=12.217..2175.630 rows=875352 loops=1)
-> Hash (cost=1.02..1.02 rows=2 width=136) (actual
time=1.723..1.723 rows=2 loops=1)
-> Seq Scan on alias_tab a (cost=0.00..1.02
rows=2 width=136) (actual time=1.708..1.713 rows=2 loops=1)
-> Hash (cost=154386.99..154386.99 rows=612899 width=22)
(actual time=23979.297..23979.297 rows=630294 loops=1)
-> Seq Scan on message_tab m (cost=0.00..154386.99
rows=612899 width=22) (actual time=60.388..23027.945 rows=630294 loops=1)
Total runtime: 27391.457 ms
(13 rows)

explain analyze select messageid from ( select m.messageid from
message_tab m, recipient_tab r where m.messagesendmailid =
r.messagesendmailid and r.recipient = '<email>' union select m.messageid
from message_tab m, recipient_tab r, alias_tab a where
m.messagesendmailid = r.messagesendmailid and r.recipient = a.alias and
a.email = '<email>') as query;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=37090.78..37108.56 rows=3556 width=4) (actual
time=24.784..27.078 rows=1183 loops=1)
-> Sort (cost=37090.78..37099.67 rows=3556 width=4) (actual
time=24.781..25.516 rows=1183 loops=1)
Sort Key: messageid
-> Append (cost=57.32..36881.05 rows=3556 width=4) (actual
time=0.516..23.300 rows=1183 loops=1)
-> Nested Loop (cost=57.32..17618.98 rows=1656 width=4)
(actual time=0.514..15.268 rows=802 loops=1)
-> Bitmap Heap Scan on recipient_tab r
(cost=57.32..4307.33 rows=1656 width=18) (actual time=0.492..1.563
rows=1299 loops=1)
Recheck Cond: ((recipient)::text =
'omatthews(at)design-pt(dot)com'::text)
-> Bitmap Index Scan on
recipient_recipient_idx (cost=0.00..56.90 rows=1656 width=0) (actual
time=0.471..0.471 rows=1299 loops=1)
Index Cond: ((recipient)::text =
'<email>'::text)
-> Index Scan using message_messagesendmailid_idx
on message_tab m (cost=0.00..8.03 rows=1 width=22) (actual
time=0.008..0.008 rows=1 loops=1299)
Index Cond: ((m.messagesendmailid)::text =
(r.messagesendmailid)::text)
-> Nested Loop (cost=67.21..19226.51 rows=1900 width=4)
(actual time=0.337..6.666 rows=381 loops=1)
-> Nested Loop (cost=67.21..4769.43 rows=1900
width=18) (actual time=0.323..1.702 rows=381 loops=1)
-> Seq Scan on alias_tab a (cost=0.00..1.02
rows=1 width=68) (actual time=0.018..0.020 rows=1 loops=1)
Filter: ((email)::text = '<email>'::text)
-> Bitmap Heap Scan on recipient_tab r
(cost=67.21..4744.66 rows=1900 width=43) (actual time=0.296..1.186
rows=381 loops=1)
Recheck Cond: ((r.recipient)::text =
(a.alias)::text)
-> Bitmap Index Scan on
recipient_recipient_idx (cost=0.00..66.73 rows=1900 width=0) (actual
time=0.206..0.206 rows=381 loops=1)
Index Cond: ((r.recipient)::text
= (a.alias)::text)
-> Index Scan using message_messagesendmailid_idx
on message_tab m (cost=0.00..7.60 rows=1 width=22) (actual
time=0.009..0.010 rows=1 loops=381)
Index Cond: ((m.messagesendmailid)::text =
(r.messagesendmailid)::text)
Total runtime: 27.827 ms
(22 rows)

Browse pgsql-performance by date

  From Date Subject
Next Message giuseppe-r@tiscali.it 2007-11-22 14:09:52 tuning for TPC-C benchmark
Previous Message Greg Smith 2007-11-19 17:22:57 Re: autovacuum: recommended?