NOT IN >2hrs vs EXCEPT < 2 sec.

From: Kevin Traster <kevin(at)mffais(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: NOT IN >2hrs vs EXCEPT < 2 sec.
Date: 2009-01-29 07:01:34
Message-ID: 72188cf00901282301x6abba690p38c3dff0df4bc338@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2 questions:

1) Different costs for same actions. Doing an explain on 2 nearly identical
queries both involving the same Index scan on same table has 2 widely
different costs for same Index scan 303375872.86 vs. 12576.70

2) Simple query using NOT IN (subquery)was killed after 2 hrs, using the
same query (query) except (query) ran in < 2 sec.

Summary:

On devel box (Unix PG version 8.3.5) with no other database activity or
system activity after immediately completing a vacuum analyze.

The original query (below) was running for over 2 hrs and was killed.:

select distinct ciknum into tmpnocikinowner from cik where ciknum not in
(select cik from owner_cik_master);

est total cost: 303375872.86, for Index Scan: 303375616.75

Simple query broken down: explain select distinct ciknum into
tmpnocikinowner from cik ;

est total cost: 12576.70, for Index Scan: 12064.49

and

select cik from owner_cik_master

est total cost: 2587.36, for Index Scan: N/A

Actual time, the query was killed after 2hrs,

However, we ran:

select distinct ciknum into tmpnocikinowner from cik ; - actual time 861.487
ms
(select ciknum from tmpnocikinowner) except (select cik from
owner_cik_master); - actual time 1328.094 ms

##### Console log below with details ######

devel=# explain select distinct ciknum into tmpnocikinowner from cik where
ciknum not in (select cik from owner_cik_master);
QUERY
PLAN
------------------------------------------------------------------------------------------------
Unique (cost=3506.21..303375872.86 rows=71946 width=8)
-> Index Scan using cik_ciknum_idx on cik (cost=3506.21..303375616.75
rows=102444 width=8)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=3506.21..6002.40 rows=186019 width=4)
-> Seq Scan on owner_cik_master (cost=0.00..2684.19
rows=186019 width=4)
(6 rows)

Time: 0.723 ms
devel=# explain select ciknum into tmpnocikinowner from cik where ciknum not
in (select cik from owner_cik_master);
QUERY
PLAN
--------------------------------------------------------------------------------------
Seq Scan on cik (cost=3506.21..303367660.13 rows=102444 width=8)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=3506.21..6002.40 rows=186019 width=4)
-> Seq Scan on owner_cik_master (cost=0.00..2684.19 rows=186019
width=4)
(5 rows)

Time: 0.588 ms
devel=# explain select ciknum::int into tmpnocikinowner from cik where
ciknum::int not in (select cik::int from owner_cik_master);
QUERY
PLAN
--------------------------------------------------------------------------------------
Seq Scan on cik (cost=3506.21..303368428.46 rows=102444 width=8)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=3506.21..6002.40 rows=186019 width=4)
-> Seq Scan on owner_cik_master (cost=0.00..2684.19 rows=186019
width=4)
(5 rows)

Time: 0.918 ms
devel=# explain select ciknum into tmpnocikinowner from cik
;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on cik (cost=0.00..4107.87 rows=204887 width=8)
(1 row)

Time: 0.438 ms
devel=# explain select distinct ciknum into tmpnocikinowner from cik ;
QUERY
PLAN
-----------------------------------------------------------------------------------------
Unique (cost=0.00..12576.70 rows=143891 width=8)
-> Index Scan using cik_ciknum_idx on cik (cost=0.00..12064.49
rows=204887 width=8)
(2 rows)

Time: 0.468 ms
devel=# select distinct ciknum into tmpnocikinowner from cik ;
SELECT
Time: 861.487 ms

devel=# explain select ciknum from tmpnocikinowner where ciknum not in
(select cik from owner_cik_master);
QUERY
PLAN
--------------------------------------------------------------------------------------
Seq Scan on tmpnocikinowner (cost=3506.21..261092922.31 rows=88168
width=8)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=3506.21..6002.40 rows=186019 width=4)
-> Seq Scan on owner_cik_master (cost=0.00..2684.19 rows=186019
width=4)
(5 rows)

Time: 0.629 ms

devel=# explain select cik from owner_cik_master;
QUERY PLAN
------------------------------------------------------------------------
Seq Scan on owner_cik_master (cost=0.00..2684.19 rows=186019 width=4)
(1 row)

Time: 0.415 ms
devel=# explain select ciknum from tmpnocikinowner;;
QUERY PLAN
-----------------------------------------------------------------------
Seq Scan on tmpnocikinowner (cost=0.00..2587.36 rows=176336 width=8)
(1 row)

Time: 0.413 ms
devel=# explain (select ciknum from tmpnocikinowner) except (select cik
from owner_cik_master);
QUERY
PLAN
------------------------------------------------------------------------------------------------
SetOp Except (cost=47309.23..49121.00 rows=36236 width=8)
-> Sort (cost=47309.23..48215.12 rows=362355 width=8)
Sort Key: "*SELECT* 1".ciknum
-> Append (cost=0.00..8895.10 rows=362355 width=8)
-> Subquery Scan "*SELECT* 1" (cost=0.00..4350.72
rows=176336 width=8)
-> Seq Scan on tmpnocikinowner (cost=0.00..2587.36
rows=176336 width=8)
-> Subquery Scan "*SELECT* 2" (cost=0.00..4544.38
rows=186019 width=4)
-> Seq Scan on owner_cik_master (cost=0.00..2684.19
rows=186019 width=4)
(8 rows)

Time: 0.625 ms
devel=# (select ciknum from tmpnocikinowner) except (select cik from
owner_cik_master);
ciknum
--------
(0 rows)

Time: 1328.094 ms

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2009-01-29 07:37:20 Re: NOT IN >2hrs vs EXCEPT < 2 sec.
Previous Message Scott Marlowe 2009-01-28 20:03:25 Re: [PERFORMANCE] Buying hardware