BUG #6787: Query planner estimates worng costs on nodes

From: gsaviane(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #6787: Query planner estimates worng costs on nodes
Date: 2012-07-31 15:06:36
Message-ID: E1SwE1s-0002Ek-PV@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 6787
Logged by: Giorgio Saviane
Email address: gsaviane(at)gmail(dot)com
PostgreSQL version: 8.4.12
Operating system: Linux
Description:

Hi, I would submit the explain analayze results of a query with and without
enable_seqscan enabled.

This is the query:

select count(*)
from vb_messages msg
INNER JOIN vb_readings r ON msg.vb_reading_id = r.id
INNER JOIN vb_product_readings pr ON pr.vb_reading_id = r.id
where msg.reception_date >= '2012-07-01'
AND msg.reception_date < '2012-07-02'

The explain analyze result with set enable_seqscan=false is:

'Aggregate (cost=17228990.83..17228990.84 rows=1 width=0) (actual
time=1847.380..1847.380 rows=1 loops=1)'
' -> Nested Loop (cost=0.00..17228943.48 rows=18939 width=0) (actual
time=0.064..1567.371 rows=392336 loops=1)'
' -> Nested Loop (cost=0.00..396208.94 rows=11176 width=8) (actual
time=0.049..494.031 rows=35761 loops=1)'
' -> Index Scan using ix_vb_messages_reception_date on
vb_messages msg (cost=0.00..94347.00 rows=81650 width=4) (actual
time=0.024..94.808 rows=101522 loops=1)'
' Index Cond: ((reception_date >= '2012-07-01
00:00:00'::timestamp without time zone) AND (reception_date < '2012-07-02
00:00:00'::timestamp without time zone))'
' -> Index Scan using vb_readings_pkey on vb_readings r
(cost=0.00..3.68 rows=1 width=4) (actual time=0.002..0.002 rows=0
loops=101522)'
' Index Cond: (r.id = msg.vb_reading_id)'
' -> Index Scan using ix_vb_product_readings_vb_reading_id on
vb_product_readings pr (cost=0.00..1490.53 rows=1250 width=4) (actual
time=0.004..0.013 rows=11 loops=35761)'
' Index Cond: (pr.vb_reading_id = msg.vb_reading_id)'
'Total runtime: 1847.439 ms'

The explain analyze result with set enable_seqscan=true is:

'Aggregate (cost=3301249.86..3301249.87 rows=1 width=0) (actual
time=258887.238..258887.239 rows=1 loops=1)'
' -> Hash Join (cost=396345.01..3301202.51 rows=18939 width=0) (actual
time=18401.256..258561.539 rows=392336 loops=1)'
' Hash Cond: (pr.vb_reading_id = msg.vb_reading_id)'
' -> Seq Scan on vb_product_readings pr (cost=0.00..2443541.36
rows=122967136 width=4) (actual time=0.102..130182.321 rows=141509801
loops=1)'
' -> Hash (cost=396205.31..396205.31 rows=11176 width=8) (actual
time=592.774..592.774 rows=35761 loops=1)'
' -> Nested Loop (cost=0.00..396205.31 rows=11176 width=8)
(actual time=0.160..556.412 rows=35761 loops=1)'
' -> Index Scan using ix_vb_messages_reception_date on
vb_messages msg (cost=0.00..94347.00 rows=81650 width=4) (actual
time=0.068..113.526 rows=101522 loops=1)'
' Index Cond: ((reception_date >= '2012-07-01
00:00:00'::timestamp without time zone) AND (reception_date < '2012-07-02
00:00:00'::timestamp without time zone))'
' -> Index Scan using vb_readings_pkey on vb_readings r
(cost=0.00..3.68 rows=1 width=4) (actual time=0.002..0.002 rows=0
loops=101522)'
' Index Cond: (r.id = msg.vb_reading_id)'
'Total runtime: 258887.596 ms'

As you can see the table are all indexed on their foreign keys and fresh
analyzed.

I wonder why the planner estimates a very much lower cost with sequential
scans enabled but the actual execution time is very much higher than that
with sequential scans disabled.
It seems the cost assigned to a sequential scan over 120 million records is
too low or, on the other side, the cost assigned to two nested loop over
three indexes is too high.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-07-31 22:13:07 Re: BUG #6787: Query planner estimates worng costs on nodes
Previous Message Marti Raudsepp 2012-07-31 10:35:44 Re: BUG #6751: usage flaws in pg_restore