Skip site navigation (1) Skip section navigation (2)

Random plan selection in DBT-3

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Random plan selection in DBT-3
Date: 2008-01-07 02:34:12
Message-ID: 20080107112257.647E.ITAGAKI.TAKAHIRO@oss.ntt.co.jp (view raw or flat)
Thread:
Lists: pgsql-hackers
I tested 8.3beta4 with DBT-3 (TPC-H) and found unstable selection of plans.
Planner randomly selected two types of plans (A, B) when I repeated EXPLAIN.

One of the conditions is used in Seq Scan Filter in Plan A. In contrast,
the same condition is used in Hash Join filter in Plan B. Plan A is
faster than B because the condition is used early. Cost of the plan A is
also cheap then B (A:307307.47 vs. B:351706.51).

Where did the randomness come from? Are there large randomness in planner?
I wonder why the worse plan is picked in spite of significantly different costs.

---- Outline of Plan A ----
 Hash Join
   -> Seq Scan
        Filter: (*condition*)
   -> Hash
        -> Seq Scan

---- Outline of Plan B  ----
 Hash Join
    Join Filter: (*condition*)
    -> Seq Scan
    -> Hash
         -> Seq Scan

---- Query ----
dbt3=# EXPLAIN SELECT sum(l_extendedprice* (1 - l_discount)) AS revenue
 FROM lineitem, part
 WHERE ( p_partkey = l_partkey
     AND p_brand = 'Brand#53'
     AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
     AND l_quantity >= 10
     AND l_quantity <= 10+10
     AND p_size BETWEEN 1 AND 5
     AND l_shipmode IN ('AIR', 'AIR REG')
     AND l_shipinstruct = 'DELIVER IN PERSON' )
     OR ( p_partkey = l_partkey
     AND p_brand = 'Brand#45'
     AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
     AND l_quantity >= 20
     AND l_quantity <= 20+10
     AND p_size BETWEEN 1 AND 10
     AND l_shipmode IN ('AIR', 'AIR REG')
     AND l_shipinstruct = 'DELIVER IN PERSON' )
     OR ( p_partkey = l_partkey
     AND p_brand = 'Brand#31'
     AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
     AND l_quantity >= 30
     AND l_quantity <= 30+10
     AND p_size BETWEEN 1 AND 15
     AND l_shipmode IN ('AIR', 'AIR REG')
     AND l_shipinstruct = 'DELIVER IN PERSON' );

---- Details of Plan A ----
 Aggregate  (cost=307307.46..307307.47 rows=1 width=8)
   ->  Hash Join  (cost=10432.00..307307.17 rows=112 width=8)
         Hash Cond: (lineitem.l_partkey = part.p_partkey)
         Join Filter: (((part.p_brand = 'Brand#53'::bpchar)
                  AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM PKG"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 10::double precision)
                  AND (lineitem.l_quantity <= 20::double precision)
                  AND (part.p_size <= 5))
                   OR ((part.p_brand = 'Brand#45'::bpchar)
                  AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED PACK"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 20::double precision)
                  AND (lineitem.l_quantity <= 30::double precision)
                  AND (part.p_size <= 10)) OR ((part.p_brand = 'Brand#31'::bpchar)
                  AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 30::double precision)
                  AND (lineitem.l_quantity <= 40::double precision)
                  AND (part.p_size <= 15)))
         ->  Seq Scan on lineitem  (cost=0.00..287087.85 rows=112605 width=16)
(*)            Filter: ((l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))
                   AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar)
                   AND (((l_quantity >= 10::double precision)
                   AND (l_quantity <= 20::double precision))
                    OR ((l_quantity >= 20::double precision)
                   AND (l_quantity <= 30::double precision))
                    OR ((l_quantity >= 30::double precision)
                   AND (l_quantity <= 40::double precision))))
         ->  Hash  (cost=6564.00..6564.00 rows=200000 width=30)
               ->  Seq Scan on part  (cost=0.00..6564.00 rows=200000 width=30)
                     Filter: (p_size >= 1)

---- Details of Plan B ----
 Aggregate  (cost=351706.50..351706.51 rows=1 width=8)
   ->  Hash Join  (cost=10432.00..351706.22 rows=112 width=8)
         Hash Cond: (lineitem.l_partkey = part.p_partkey)
         Join Filter: (((part.p_brand = 'Brand#53'::bpchar)
                  AND (part.p_container = ANY ('{"SM CASE","SM BOX","SM PACK","SM PKG"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 10::double precision)
                  AND (lineitem.l_quantity <= 20::double precision)
                  AND (part.p_size <= 5)
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))
                   OR ((part.p_brand = 'Brand#45'::bpchar)
                  AND (part.p_container = ANY ('{"MED BAG","MED BOX","MED PKG","MED PACK"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 20::double precision)
                  AND (lineitem.l_quantity <= 30::double precision)
                  AND (part.p_size <= 10)
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[])))
                   OR ((part.p_brand = 'Brand#31'::bpchar)
                  AND (part.p_container = ANY ('{"LG CASE","LG BOX","LG PACK","LG PKG"}'::bpchar[]))
                  AND (lineitem.l_quantity >= 30::double precision)
                  AND (lineitem.l_quantity <= 40::double precision)
                  AND (part.p_size <= 15)
(*)               AND (lineitem.l_shipmode = ANY ('{AIR,"AIR REG"}'::bpchar[]))))
         ->  Seq Scan on lineitem  (cost=0.00..272084.61 rows=787029 width=27)
               Filter: ((l_shipinstruct = 'DELIVER IN PERSON'::bpchar)
                   AND (((l_quantity >= 10::double precision)
                   AND (l_quantity <= 20::double precision))
                    OR ((l_quantity >= 20::double precision)
                   AND (l_quantity <= 30::double precision))
                    OR ((l_quantity >= 30::double precision)
                   AND (l_quantity <= 40::double precision))))
         ->  Hash  (cost=6564.00..6564.00 rows=200000 width=30)
               ->  Seq Scan on part  (cost=0.00..6564.00 rows=200000 width=30)
                     Filter: (p_size >= 1)



Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2008-01-07 02:54:49
Subject: Re: Random plan selection in DBT-3
Previous:From: Tom LaneDate: 2008-01-07 02:22:09
Subject: Re: Index trouble with 8.3b4

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group