From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Oleg Lebedev <oleg(dot)lebedev(at)waterford(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: TPC-R benchmarks |
Date: | 2003-10-07 23:59:17 |
Message-ID: | 200310071659.17863.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom,
I've found the problem with TPC-R query #19. And it, unfortunately, appears
to be a problem in the PostgreSQL query planner.
To sum up the below: it appears that whenever a set of WHERE conditions
exceeds a certain level of complexity, the planner just ignores all
applicable indexes and goes for a seq scan. While this may be unavoidable
to some degree, it seems to me that we need to raise the threshold of
complexity at which it does this.
tpcr=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 7.3.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3 20030226
(prerelease) (SuSE Linux)
(1 row)
I've tested a number of indexes on the query, and found the two most efficient
on subsets of the query. Thus:
explain analyze
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#33'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 8 and l_quantity <= 8 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=10380.70..10380.70 rows=1 width=30) (actual
time=161.61..161.61 rows=1 loops=1)
-> Nested Loop (cost=0.00..10380.67 rows=13 width=30) (actual
time=81.54..161.47 rows=17 loops=1)
-> Index Scan using idx_part_1 on part (cost=0.00..9466.33 rows=62
width=4) (actual time=81.21..137.24 rows=98 loops=1)
Index Cond: (p_brand = 'Brand#33'::bpchar)
Filter: (((p_container = 'SM CASE'::bpchar) OR (p_container =
'SM BOX'::bpchar) OR (p_container = 'SM PACK'::bpchar) OR (p_container = 'SM
PKG'::bpchar)) AND (p_size >= 1) AND (p_size <= 5))
-> Index Scan using idx_lineitem_3 on lineitem (cost=0.00..14.84
rows=1 width=26) (actual time=0.22..0.24 rows=0 loops=98)
Index Cond: (("outer".p_partkey = lineitem.l_partkey) AND
(lineitem.l_quantity >= 8::numeric) AND (lineitem.l_quantity <= 18::numeric))
Filter: (((l_shipmode = 'AIR'::bpchar) OR (l_shipmode = 'AIR
REG'::bpchar)) AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar))
Total runtime: 161.71 msec
This also works for a similar query:
explain analyze
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#52'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 14 and l_quantity <= 14 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
Aggregate (cost=11449.36..11449.36 rows=1 width=30) (actual
time=195.72..195.72 rows=1 loops=1)
-> Nested Loop (cost=0.00..11449.29 rows=28 width=30) (actual
time=56.42..195.39 rows=48 loops=1)
-> Index Scan using idx_part_1 on part (cost=0.00..9466.33 rows=139
width=4) (actual time=56.15..153.17 rows=166 loops=1)
Index Cond: (p_brand = 'Brand#52'::bpchar)
Filter: (((p_container = 'MED BAG'::bpchar) OR (p_container =
'MED BOX'::bpchar) OR (p_container = 'MED PKG'::bpchar) OR (p_container =
'MED PACK'::bpchar)) AND (p_size >= 1) AND (p_size <= 10))
-> Index Scan using idx_lineitem_3 on lineitem (cost=0.00..14.29
rows=1 width=26) (actual time=0.23..0.25 rows=0 loops=166)
Index Cond: (("outer".p_partkey = lineitem.l_partkey) AND
(lineitem.l_quantity >= 14::numeric) AND (lineitem.l_quantity <=
24::numeric))
Filter: (((l_shipmode = 'AIR'::bpchar) OR (l_shipmode = 'AIR
REG'::bpchar)) AND (l_shipinstruct = 'DELIVER IN PERSON'::bpchar))
Total runtime: 195.82 msec
(9 rows)
If, however, I combine the two where clauses with an OR, the planner gets
confused and insists on loading the entire tables into memory (even though I
don't have that much memory):
explain
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#33'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 8 and l_quantity <= 8 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#52'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 14 and l_quantity <= 14 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
Aggregate (cost=488301096525.25..488301096525.25 rows=1 width=146)
-> Nested Loop (cost=0.00..488301096525.15 rows=42 width=146)
Join Filter: ((("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'SM CASE'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#33'::bpchar) AND
("outer".l_quantity >= 8::numeric) AND ("outer".l_quantity <= 18::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 5) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'SM
CASE'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#33'::bpchar) AND ("outer".l_quantity >= 8::numeric)
AND ("outer".l_quantity <= 18::numeric) AND ("inner".p_size >= 1) AND
("inner".p_size <= 5) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'SM BOX'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#33'::bpchar) AND
("outer".l_quantity >= 8::numeric) AND ("outer".l_quantity <= 18::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 5) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'SM
BOX'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#33'::bpchar) AND ("outer".l_quantity >= 8::numeric)
AND ("outer".l_quantity <= 18::numeric) AND ("inner".p_size >= 1) AND
("inner".p_size <= 5) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'SM PACK'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#33'::bpchar) AND
("outer".l_quantity >= 8::numeric) AND ("outer".l_quantity <= 18::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 5) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'SM
PACK'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#33'::bpchar) AND ("outer".l_quantity >= 8::numeric)
AND ("outer".l_quantity <= 18::numeric) AND ("inner".p_size >= 1) AND
("inner".p_size <= 5) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'SM PKG'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#33'::bpchar) AND
("outer".l_quantity >= 8::numeric) AND ("outer".l_quantity <= 18::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 5) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'SM
PKG'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#33'::bpchar) AND ("outer".l_quantity >= 8::numeric)
AND ("outer".l_quantity <= 18::numeric) AND ("inner".p_size >= 1) AND
("inner".p_size <= 5) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'MED BAG'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#52'::bpchar) AND
("outer".l_quantity >= 14::numeric) AND ("outer".l_quantity <= 24::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 10) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'MED
BAG'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#52'::bpchar) AND ("outer".l_quantity >=
14::numeric) AND ("outer".l_quantity <= 24::numeric) AND ("inner".p_size >=
1) AND ("inner".p_size <= 10) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'MED BOX'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#52'::bpchar) AND
("outer".l_quantity >= 14::numeric) AND ("outer".l_quantity <= 24::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 10) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'MED
BOX'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#52'::bpchar) AND ("outer".l_quantity >=
14::numeric) AND ("outer".l_quantity <= 24::numeric) AND ("inner".p_size >=
1) AND ("inner".p_size <= 10) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'MED PKG'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#52'::bpchar) AND
("outer".l_quantity >= 14::numeric) AND ("outer".l_quantity <= 24::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 10) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'MED
PKG'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#52'::bpchar) AND ("outer".l_quantity >=
14::numeric) AND ("outer".l_quantity <= 24::numeric) AND ("inner".p_size >=
1) AND ("inner".p_size <= 10) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)) OR (("outer".l_shipmode = 'AIR'::bpchar) AND
("inner".p_container = 'MED PACK'::bpchar) AND ("inner".p_partkey =
"outer".l_partkey) AND ("inner".p_brand = 'Brand#52'::bpchar) AND
("outer".l_quantity >= 14::numeric) AND ("outer".l_quantity <= 24::numeric)
AND ("inner".p_size >= 1) AND ("inner".p_size <= 10) AND
("outer".l_shipinstruct = 'DELIVER IN PERSON'::bpchar)) OR
(("outer".l_shipmode = 'AIR REG'::bpchar) AND ("inner".p_container = 'MED
PACK'::bpchar) AND ("inner".p_partkey = "outer".l_partkey) AND
("inner".p_brand = 'Brand#52'::bpchar) AND ("outer".l_quantity >=
14::numeric) AND ("outer".l_quantity <= 24::numeric) AND ("inner".p_size >=
1) AND ("inner".p_size <= 10) AND ("outer".l_shipinstruct = 'DELIVER IN
PERSON'::bpchar)))
-> Seq Scan on lineitem (cost=0.00..235620.15 rows=6001215
width=95)
-> Seq Scan on part (cost=0.00..7367.00 rows=200000 width=51)
You'll pardon me for not doing an "ANALYZE", but I didn't want to wait
overnight. Manually disabling Seqscan and Nestloop did nothing to affect
this query plan; neither did removing the aggregate.
Tommorrow I will test 7.4 Beta 4.
How can we fix this?
--
-Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-10-08 04:15:57 | Re: TPC-R benchmarks |
Previous Message | Timothy D. Witham | 2003-10-07 18:34:33 | Re: TPC-R benchmarks |