Huge overestimation in rows expected results in bad plan

From: bricklen <bricklen(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Huge overestimation in rows expected results in bad plan
Date: 2010-11-09 21:26:47
Message-ID: AANLkTim=JS-tu5G1CvrPnvfSTj7=YSxu=kwNUP+hLVfG@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I have a query that is getting a pretty bad plan due to a massively
incorrect count of expected rows. All tables in the query were vacuum
analyzed right before the query was tested. Disabling nested loops
gives a significantly faster result (4s vs 292s).
Any thoughts on what I can change to make the planner generate a better plan?

32GB ram
effective_cache_size = 16GB
shared_buffers = 4GB
random_page_cost = 1.5
default_statistics_target = 100
Note: for the tables in question, I tested default_statistics_target
at 100, then also at 5000 to see if there was an improvement (none
noted).

select version();
version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20071124 (Red Hat 4.1.2-42), 64-bit

explain analyze
select c.id, c.transactionid, c.clickgenerated, c.confirmed,
c.rejected, cr.rejectedreason
from conversion c
inner join conversionrejected cr on cr.idconversion = c.id
where date = '2010-11-06'
and idaction = 12906
and idaffiliate = 198338
order by transactionid;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=2318120.52..2345652.23 rows=11012683 width=78) (actual
time=292668.896..292668.903 rows=70 loops=1)
Sort Key: c.transactionid
Sort Method: quicksort Memory: 43kB
-> Nested Loop (cost=1234.69..715468.13 rows=11012683 width=78)
(actual time=8687.314..292668.159 rows=70 loops=1)
Join Filter: ((cr.idconversion = c.id) OR (c.id = 38441828354::bigint))
-> Append (cost=1234.69..1244.03 rows=2 width=56) (actual
time=15.292..15.888 rows=72 loops=1)
-> Bitmap Heap Scan on conversion c
(cost=1234.69..1240.76 rows=1 width=31) (actual time=15.291..15.840
rows=72 loops=1)
Recheck Cond: ((idaffiliate = 198338) AND (date =
'2010-11-06'::date))
Filter: (idaction = 12906)
-> BitmapAnd (cost=1234.69..1234.69 rows=4
width=0) (actual time=15.152..15.152 rows=0 loops=1)
-> Bitmap Index Scan on
conversion_idaffiliate_idx (cost=0.00..49.16 rows=3492 width=0)
(actual time=4.071..4.071 rows=28844 loops=1)
Index Cond: (idaffiliate = 198338)
-> Bitmap Index Scan on
conversion_date_idx (cost=0.00..1185.28 rows=79282 width=0) (actual
time=10.343..10.343 rows=82400 loops=1)
Index Cond: (date = '2010-11-06'::date)
-> Index Scan using conversionlate_date_idx on
conversionlate c (cost=0.00..3.27 rows=1 width=80) (actual
time=0.005..0.005 rows=0 loops=1)
Index Cond: (date = '2010-11-06'::date)
Filter: ((idaction = 12906) AND (idaffiliate = 198338))
-> Seq Scan on conversionrejected cr (cost=0.00..191921.82
rows=11012682 width=31) (actual time=0.003..1515.816 rows=11012682
loops=72)
Total runtime: 292668.992 ms

select count(*) from conversionrejected ;
count
----------
11013488

Time: 3649.647 ms

select count(*) from conversion where date = '2010-11-06';
count
-------
82400

Time: 507.985 ms

select count(*) from conversion;
count
----------
73419376(1 row)

Time: 7100.619 ms

-- with enable_nestloop to off;
-- much faster!

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=234463.54..234463.54 rows=2 width=78) (actual
time=4035.340..4035.347 rows=70 loops=1)
Sort Key: c.transactionid
Sort Method: quicksort Memory: 43kB
-> Hash Join (cost=1244.13..234463.53 rows=2 width=78) (actual
time=4024.816..4034.715 rows=70 loops=1)
Hash Cond: (cr.idconversion = c.id)
-> Seq Scan on conversionrejected cr (cost=0.00..191921.82
rows=11012682 width=31) (actual time=0.003..1949.597 rows=11013576
loops=1)
-> Hash (cost=1244.11..1244.11 rows=2 width=56) (actual
time=19.312..19.312 rows=72 loops=1)
-> Append (cost=1234.77..1244.11 rows=2 width=56)
(actual time=18.539..19.261 rows=72 loops=1)
-> Bitmap Heap Scan on conversion c
(cost=1234.77..1240.83 rows=1 width=31) (actual time=18.538..19.235
rows=72 loops=1)
Recheck Cond: ((idaffiliate = 198338) AND
(date = '2010-11-06'::date))
Filter: (idaction = 12906)
-> BitmapAnd (cost=1234.77..1234.77
rows=4 width=0) (actual time=18.237..18.237 rows=0 loops=1)
-> Bitmap Index Scan on
conversion_idaffiliate_idx (cost=0.00..49.16 rows=3492 width=0)
(actual time=4.932..4.932 rows=28844 loops=1)
Index Cond: (idaffiliate = 198338)
-> Bitmap Index Scan on
conversion_date_idx (cost=0.00..1185.36 rows=79292 width=0) (actual
time=12.473..12.473 rows=82400 loops=1)
Index Cond: (date = '2010-11-06'::date)
-> Index Scan using conversionlate_date_idx on
conversionlate c (cost=0.00..3.27 rows=1 width=80) (actual
time=0.006..0.006 rows=0 loops=1)
Index Cond: (date = '2010-11-06'::date)
Filter: ((idaction = 12906) AND
(idaffiliate = 198338))
Total runtime: 4035.439 ms

-- for completeness,
-- same query, on 9.0.0, underpowered server, 2 disks mirrored.
Approximately the same table sizes/counts.

select version();
version
-------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.0.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=120727.25..120727.25 rows=2 width=78) (actual
time=3214.827..3214.867 rows=70 loops=1)
Sort Key: c.transactionid
Sort Method: quicksort Memory: 43kB
-> Nested Loop (cost=697.95..120727.24 rows=2 width=78) (actual
time=2955.321..3214.208 rows=70 loops=1)
-> Append (cost=697.95..120712.87 rows=2 width=56) (actual
time=2931.584..3173.402 rows=72 loops=1)
-> Bitmap Heap Scan on conversion c
(cost=697.95..120706.59 rows=1 width=31) (actual
time=2931.582..3150.231 rows=72 loops=1)
Recheck Cond: (date = '2010-11-06'::date)
Filter: ((idaction = 12906) AND (idaffiliate = 198338))
-> Bitmap Index Scan on conversion_date_idx
(cost=0.00..697.95 rows=44365 width=0) (actual time=51.692..51.692
rows=82400 loops=1)
Index Cond: (date = '2010-11-06'::date)
-> Index Scan using conversionlate_idaffiliate_idx on
conversionlate c (cost=0.00..6.27 rows=1 width=80) (actual
time=23.091..23.091 rows=0 loops=1)
Index Cond: (idaffiliate = 198338)
Filter: ((date = '2010-11-06'::date) AND
(idaction = 12906))
-> Index Scan using conversionrejected_pk on
conversionrejected cr (cost=0.00..7.17 rows=1 width=31) (actual
time=0.563..0.564 rows=1 loops=72)
Index Cond: (cr.idconversion = c.id)
Total runtime: 3214.972 ms

Thanks,

Bricklen

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andy Colson 2010-11-09 22:48:04 Re: Huge overestimation in rows expected results in bad plan
Previous Message Kevin Grittner 2010-11-09 21:18:13 anti-join chosen even when slower than old plan