From: | pgsql-bugs(at)postgresql(dot)org |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Bug #598: optimizer: convert 'IN' to join |
Date: | 2002-02-20 21:29:59 |
Message-ID: | 200202202129.g1KLTxH28125@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
piers haken (piersh(at)friskit(dot)com) reports a bug with a severity of 3
The lower the number the more severe it is.
Short Description
optimizer: convert 'IN' to join
Long Description
the optimizer should do better than a sequential scan with statements like:
SELECT * FROM t1 WHERE t1.index IN (SELECT t2.index FROM t2);
this gives:
Seq Scan on t1
SubPlan
-> Seq scan on t2
this is equivalent to (and should be transformed to)
SELECT t1.* FROM t1, t2 WHERE t1.index = t2.index;
which gives the much faster:
Nested Loop
-> Seq Scan on t1
-> Index Scan using t2_pkey on t2
FYI: SQL Server generates a hash table from t1 and probes it with pkey values of t2 read from a NON-primary index. For SQL Server, scanning a non-primary key takes fewer disk reads than scanning a primary key.
if you add a condition to the subquery on a non-unique column:
SELECT * FROM t1 WHERE t1.index IN (SELECT t2.index FROM t2 WHERE t2.value='something');
or
SELECT t1.* FROM t1, t2 WHERE t1.index=t2.index AND t2.value='something';
you also get two different plans:
Seq Scan on t1
SubPlan
-> Materialize
-> Index Scan using ix_t2_value on t2
as opposed to
Hash Join
-> Seq Scan on t1
-> Hash
-> Index Scan using ix_t2_value on t2
Sample Code
No file was uploaded with this report
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2002-02-20 21:31:27 | Re: [GENERAL] schema error upgrading from 7.1 to 7.2 |
Previous Message | Vivek Khera | 2002-02-20 21:00:22 | schema error upgrading from 7.1 to 7.2 |