Random plan selection in DBT-3

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Random plan selection in DBT-3
Date: 2008-01-07 02:34:12
Message-ID: 20080107112257.647E.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I tested 8.3beta4 with DBT-3 (TPC-H) and found unstable selection of plans.
Planner randomly selected two types of plans (A, B) when I repeated EXPLAIN.

One of the conditions is used in Seq Scan Filter in Plan A. In contrast,
the same condition is used in Hash Join filter in Plan B. Plan A is
faster than B because the condition is used early. Cost of the plan A is
also cheap then B (A:307307.47 vs. B:351706.51).

Where did the randomness come from? Are there large randomness in planner?
I wonder why the worse plan is picked in spite of significantly different costs.

---- Outline of Plan A ----
Hash Join
-> Seq Scan
Filter: (*condition*)
-> Hash
-> Seq Scan

---- Outline of Plan B ----
Hash Join
Join Filter: (*condition*)
-> Seq Scan
-> Hash
-> Seq Scan

---- Query ----
dbt3=# EXPLAIN SELECT sum(l_extendedprice* (1 - l_discount)) AS revenue
FROM lineitem, part
WHERE ( p_partkey = l_partkey
AND p_brand = 'Brand#53'
AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
AND l_quantity >= 10
AND l_quantity <= 10+10
AND p_size BETWEEN 1 AND 5
AND l_shipmode IN ('AIR', 'AIR REG')
AND l_shipinstruct = 'DELIVER IN PERSON' )
OR ( p_partkey = l_partkey
AND p_brand = 'Brand#45'
AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
AND l_quantity >= 20
AND l_quantity <= 20+10
AND p_size BETWEEN 1 AND 10
AND l_shipmode IN ('AIR', 'AIR REG')
AND l_shipinstruct = 'DELIVER IN PERSON' )
OR ( p_partkey = l_partkey
AND p_brand = 'Brand#31'
AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
AND l_quantity >= 30
AND l_quantity <= 30+10
AND p_size BETWEEN 1 AND 15
AND l_shipmode IN ('AIR', 'AIR REG')
AND l_shipinstruct = 'DELIVER IN PERSON' );

---- Details of Plan A ----
Aggregate (cost=307307.46..307307.47 rows=1 width=8)
-> Hash Join (cost=10432.00..307307.17 rows=112 width=8)
Hash Cond: (lineitem.l_partkey = part.p_partkey)
Join Filter: (((part.p_brand = 'Brand#53'::bpchar)
AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM PKG"}'::bpchar[]))
AND (lineitem.l_quantity >= 10::double precision)
AND (lineitem.l_quantity <= 20::double precision)
AND (part.p_size <= 5))
OR ((part.p_brand = 'Brand#45'::bpchar)
AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED PACK"}'::bpchar[]))
AND (lineitem.l_quantity >= 20::double precision)
AND (lineitem.l_quantity <= 30::double precision)
AND (part.p_size <= 10)) OR ((part.p_brand = 'Brand#31'::bpchar)
AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
AND (lineitem.l_quantity >= 30::double precision)
AND (lineitem.l_quantity <= 40::double precision)
AND (part.p_size <= 15)))
-> Seq Scan on lineitem (cost=0.00..287087.85 rows=112605 width=16)
(*) Filter: ((l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))
AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar)
AND (((l_quantity >= 10::double precision)
AND (l_quantity <= 20::double precision))
OR ((l_quantity >= 20::double precision)
AND (l_quantity <= 30::double precision))
OR ((l_quantity >= 30::double precision)
AND (l_quantity <= 40::double precision))))
-> Hash (cost=6564.00..6564.00 rows=200000 width=30)
-> Seq Scan on part (cost=0.00..6564.00 rows=200000 width=30)
Filter: (p_size >= 1)

---- Details of Plan B ----
Aggregate (cost=351706.50..351706.51 rows=1 width=8)
-> Hash Join (cost=10432.00..351706.22 rows=112 width=8)
Hash Cond: (lineitem.l_partkey = part.p_partkey)
Join Filter: (((part.p_brand = 'Brand#53'::bpchar)
AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM PKG"}'::bpchar[]))
AND (lineitem.l_quantity >= 10::double precision)
AND (lineitem.l_quantity <= 20::double precision)
AND (part.p_size <= 5)
(*) AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))
OR ((part.p_brand = 'Brand#45'::bpchar)
AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED PACK"}'::bpchar[]))
AND (lineitem.l_quantity >= 20::double precision)
AND (lineitem.l_quantity <= 30::double precision)
AND (part.p_size <= 10)
(*) AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))
OR ((part.p_brand = 'Brand#31'::bpchar)
AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
AND (lineitem.l_quantity >= 30::double precision)
AND (lineitem.l_quantity <= 40::double precision)
AND (part.p_size <= 15)
(*) AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))))
-> Seq Scan on lineitem (cost=0.00..272084.61 rows=787029 width=27)
Filter: ((l_shipinstruct = 'DELIVER IN PERSON'::bpchar)
AND (((l_quantity >= 10::double precision)
AND (l_quantity <= 20::double precision))
OR ((l_quantity >= 20::double precision)
AND (l_quantity <= 30::double precision))
OR ((l_quantity >= 30::double precision)
AND (l_quantity <= 40::double precision))))
-> Hash (cost=6564.00..6564.00 rows=200000 width=30)
-> Seq Scan on part (cost=0.00..6564.00 rows=200000 width=30)
Filter: (p_size >= 1)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-01-07 02:54:49 Re: Random plan selection in DBT-3
Previous Message Tom Lane 2008-01-07 02:22:09 Re: Index trouble with 8.3b4