Re: BUG #7556: "select not in sub query" plan very poor vs "not exists"

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: l1t(at)tom(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7556: "select not in sub query" plan very poor vs "not exists"
Date: 2012-09-20 04:39:39
Message-ID: 505A9E0B.3000600@ringerc.id.au
Lists: pgsql-bugs

On 09/19/2012 01:48 PM, l1t(at)tom(dot)com wrote:
> The following bug has been logged on the website:
>
> Bug reference: 7556
> Logged by: lt
> Email address: l1t(at)tom(dot)com
> PostgreSQL version: 9.2.0
> Operating system: windows xp
> Description:
>

> create table sli_test (id int primary key,info varchar(20));
> insert into sli_test select
> generate_series(1,1000000),'digoal'||generate_series(1,1000000);
> analyze verbose sli_test;
> create table sli_test2 (id int not null,info varchar(20));
> insert into sli_test2 select
> generate_series(1,1000000),'dbase'||generate_series(1,1000000);
> analyze verbose sli_test2;
>
> explain select max(a.info)from sli_test a where a.id not in(select
> b.id from sli_test2 b where b.id<50000);

> QUERY PLAN
> ---------------------------------------------------------------------------------------
> Aggregate (cost=9241443774.00..9241443774.01 rows=1 width=12)

Here's what I get on 9.1:

regress=# explain select max(a.info)from sli_test a where a.id not in(select
regress(# b.id from sli_test2 b where b.id<50000);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=38050.82..38050.83 rows=1 width=12)
   ->  Seq Scan on sli_test a  (cost=18026.82..36800.82 rows=500000 width=12)
         Filter: (NOT (hashed SubPlan 1))
         SubPlan 1
           ->  Seq Scan on sli_test2 b  (cost=0.00..17906.00 rows=48329 width=4)
                 Filter: (id < 50000)
(6 rows)

It runs in about 500ms here.

You don't appear to have posted the full query plan, so it's hard to
compare.
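
For a like-for-like comparison, something along these lines would capture
the complete plan together with actual timings. This is only a sketch. It
also checks `work_mem`, because the planner picks the "hashed SubPlan" form
shown above only when it expects the hash table to fit in `work_mem`.
Whether that setting explains the difference in this report is an
assumption, not something the posted output confirms, and the 64MB value is
an arbitrary illustration.

```sql
-- Current per-operation memory budget for this session.
SHOW work_mem;

-- Full plan with actual row counts, timings and buffer usage.
-- Note: ANALYZE really executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT max(a.info)
FROM sli_test a
WHERE a.id NOT IN (SELECT b.id FROM sli_test2 b WHERE b.id < 50000);

-- Assumption: raise work_mem for this session only, then compare plans.
SET work_mem = '64MB';

EXPLAIN
SELECT max(a.info)
FROM sli_test a
WHERE a.id NOT IN (SELECT b.id FROM sli_test2 b WHERE b.id < 50000);

-- Return to the configured default.
RESET work_mem;
```

If the second plan shows `hashed SubPlan` where the first showed a plain
`SubPlan`, the memory setting is the likely difference between the two
installations.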

In general, `NOT IN` is a poor formulation for a query; you're better
off with a JOIN or with `NOT EXISTS`. See e.g.:

http://stackoverflow.com/questions/12444142/postgresql-how-to-figure-out-missing-numbers-in-a-column-using-generate-series/12444165#12444165
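
As a rough sketch, the query from the report could be rewritten in either
of the following ways. Both return the same result as the `NOT IN` version
here only because `sli_test2.id` is declared NOT NULL and `sli_test.id` is a
primary key; with nullable columns `NOT IN` has different semantics, which
is also why the planner cannot treat it as an anti-join.

```sql
-- NOT EXISTS form: the planner can execute this as an anti-join.
EXPLAIN
SELECT max(a.info)
FROM sli_test a
WHERE NOT EXISTS (
    SELECT 1
    FROM sli_test2 b
    WHERE b.id = a.id
      AND b.id < 50000
);

-- Outer-join form: keep only the rows of sli_test that found no match.
-- The IS NULL test is reliable here because b.id is NOT NULL.
EXPLAIN
SELECT max(a.info)
FROM sli_test a
LEFT JOIN sli_test2 b
       ON b.id = a.id
      AND b.id < 50000
WHERE b.id IS NULL;
```

Comparing the `EXPLAIN` output of these against the original should show
a join node in place of the per-row SubPlan filter.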

--
Craig Ringer
