Bug #598: optimizer: convert 'IN' to join

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: Bug #598: optimizer: convert 'IN' to join
Date: 2002-02-20 21:29:59
Message-ID: 200202202129.g1KLTxH28125@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

piers haken (piersh(at)friskit(dot)com) reports a bug with a severity of 3
The lower the number the more severe it is.

Short Description
optimizer: convert 'IN' to join

Long Description
the optimizer should do better than a sequential scan with statements like:

SELECT * FROM t1 WHERE t1.index IN (SELECT t2.index FROM t2);

this gives:
Seq Scan on t1
SubPlan
-> Seq scan on t2

this is equivalent to (and should be transformed to)

SELECT t1.* FROM t1, t2 WHERE t1.index = t2.index;

which gives the much faster:
Nested Loop
-> Seq Scan on t1
-> Index Scan using t2_pkey on t2

FYI: SQL Server generates a hash table from t1 and probes it with pkey values of t2 read from a NON-primary index. For SQL Server, scanning a non-primary key takes fewer disk reads than scanning a primary key.

if you add a condition to the subquery on a non-unique column:

SELECT * FROM t1 WHERE t1.index IN (SELECT t2.index FROM t2 WHERE t2.value='something');
or
SELECT t1.* FROM t1, t2 WHERE t1.index=t2.index AND t2.value='something';

you also get two different plans:

Seq Scan on t1
SubPlan
-> Materialize
-> Index Scan using ix_t2_value on t2

as opposed to

Hash Join
-> Seq Scan on t1
-> Hash
-> Index Scan using ix_t2_value on t2

Sample Code

No file was uploaded with this report

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2002-02-20 21:31:27 Re: [GENERAL] schema error upgrading from 7.1 to 7.2
Previous Message Vivek Khera 2002-02-20 21:00:22 schema error upgrading from 7.1 to 7.2