SQL 'in' vs join.

From: mlw <markw(at)mohawksoft(dot)com>
To: Hackers List <pgsql-hackers(at)postgresql(dot)org>
Subject: SQL 'in' vs join.
Date: 2000-11-29 22:51:54
Message-ID: 3A25888A.56AEF64D@mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Why is a "select * from table1 where field in (select field from table2
where condition )"

is so dramatically bad compared to:

"select * from table1, table2 where table1.field = table2.field and
condition"

I can't understand why the first query isn't optimized better than the
second one. The 'in' query forces a full table scan (it shouldn't) and
the second one uses the indexes. Does anyone know why?

I know I am no SQL guru, but my gut tells me that the 'in' operator
should be far more efficient than a join.

Here are the actual queries:

cdinfo=# explain select trackid from zsong where muzenbr in (select
muzenbr from ztitles where title = 'Mulan') ;
NOTICE: QUERY PLAN:

Seq Scan on zsong (cost=100000000.00..219321449380756.66 rows=2193213
width=4)
SubPlan
-> Materialize (cost=100000022.50..100000022.50 rows=10 width=4)
-> Seq Scan on ztitles (cost=100000000.00..100000022.50
rows=10 width=4)

cdinfo=# explain select trackid from zsong, ztitles where
ztitles.muzenbr = zsong.muzenbr and title = 'Mulan' ;
NOTICE: QUERY PLAN:

Merge Join (cost=0.00..183664.10 rows=219321 width=12)
-> Index Scan using zsong_muznbr on zsong (cost=0.00..156187.31
rows=2193213 width=8)
-> Index Scan using ztitles_pkey on ztitles (cost=0.00..61.50
rows=10 width=4)

cdinfo=# \d zsong
Table "zsong"
Attribute | Type | Modifier
-----------+-------------------+-------------------------------------------
muzenbr | integer |
disc | integer |
trk | integer |
song | character varying |
trackid | integer | not null default
nextval('trackid'::text)
artistid | integer |
acd | character varying |
Indices: zsong_muznbr,
zsong_pkey

cdinfo=# \d ztitles
Table "ztitles"
Attribute | Type | Modifier
------------+-------------------+----------
muzenbr | integer | not null
artistid | integer |
cat2 | character varying |
cat3 | character varying |
cat4 | character varying |
performer | character varying |
performer2 | character varying |
title | character varying |
artist1 | character varying |
engineer | character varying |
producer | character varying |
labelname | character varying |
catalog | character varying |
distribut | character varying |
released | character varying |
origrel | character varying |
nbrdiscs | character varying |
spar | character varying |
minutes | character varying |
seconds | character varying |
monostereo | character varying |
studiolive | character varying |
available | character(1) |
previews | character varying |
pnotes | character varying |
acd | character varying |
Index: ztitles_pkey

--
http://www.mohawksoft.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Burton 2000-11-29 23:49:07 Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files
Previous Message Tom Lane 2000-11-29 22:27:42 Re: Warning: Don't delete those /tmp/.PGSQL.* files