Skip site navigation (1) Skip section navigation (2)

Re: Weird issue with planner choosing seq scan

From: Sean Leach <sleach(at)wiggum(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Weird issue with planner choosing seq scan
Date: 2008-02-24 18:41:26
Message-ID: 7ECB45D8-33ED-45FC-A0DA-D3DE25B6CC8E@wiggum.com (view raw or flat)
Thread:
Lists: pgsql-performance
Nope, seems like that would make sense but dev is 10 mill, prod is 3  
million.   Also including random_page_cost below.  Thanks for any help.


Here is dev:

db=> analyze u_counts;
ANALYZE
Time: 15775.161 ms

db=> select count(1) from u_counts;
   count
----------
  10972078
(1 row)

db=> show random_page_cost;
  random_page_cost
------------------
  4
(1 row)

Time: 0.543 ms
db=> explain analyze SELECT node,count(*) AS counts FROM u_counts  
c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval  
'1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;
                                                                         QUERY 
  PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=12906.12..12906.24 rows=9 width=6) (actual  
time=3135.831..3135.845 rows=9 loops=1)
    ->  Hash Join  (cost=10.42..12538.88 rows=73449 width=6) (actual  
time=0.746..2475.632 rows=391380 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Index Scan using u_counts_i2 on db c   
(cost=0.00..10882.33 rows=243105 width=4) (actual time=0.287..1269.651  
rows=391380 loops=1)
                Index Cond: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=9.53..9.53 rows=71 width=10) (actual  
time=0.430..0.430 rows=78 loops=1)
                ->  Seq Scan on res r  (cost=0.00..9.53 rows=71  
width=10) (actual time=0.021..0.203 rows=78 loops=1)
                      Filter: (((rtype)::text = 'udns'::text) AND  
(location = 1))
  Total runtime: 3136.000 ms




Now - here is prod:


db=> show random_page_cost;
  random_page_cost
------------------
  4
(1 row)

Time: 0.434 ms

db=> analyze u_counts;
ANALYZE
Time: 179.928 ms

db=> select count(1) from u_counts;
   count
---------
  3292215
(1 row)


db=> explain analyze SELECT node,count(*) AS counts FROM u_counts  
c,res r WHERE c.res_id=r.id AND stamp > (current_timestamp - interval  
'1 day') AND r.rtype='udns' AND r.location=1 GROUP BY node;
                                                                 QUERY  
PLAN
------------------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=452333.49..452333.59 rows=8 width=6) (actual  
time=13200.887..13200.902 rows=9 loops=1)
    ->  Hash Join  (cost=16.71..451192.74 rows=228149 width=6) (actual  
time=1430.458..11274.073 rows=1036015 loops=1)
          Hash Cond: (c.res_id = r.id)
          ->  Seq Scan on u_counts c  (cost=0.00..444744.45  
rows=1106691 width=4) (actual time=1429.996..7893.178 rows=1036015  
loops=1)
                Filter: (stamp > (now() - '1 day'::interval))
          ->  Hash  (cost=15.88..15.88 rows=67 width=10) (actual  
time=0.363..0.363 rows=60 loops=1)
                ->  Seq Scan on res r  (cost=0.00..15.88 rows=67  
width=10) (actual time=0.046..0.258 rows=60 loops=1)
                      Filter: (((rtype)::text = 'udns'::text) AND  
(location = 1))
  Total runtime: 13201.046 ms
(9 rows)

Time: 13204.686 ms









On Feb 24, 2008, at 9:50 AM, Tom Lane wrote:

> Sean Leach <sleach(at)wiggum(dot)com> writes:
>> I have a table, that in production, currently has a little over 3
>> million records in production.  In development, the same table has
>> about 10 million records (we have cleaned production a few weeks
>> ago).
>
> You mean the other way around, to judge by the rowcounts from EXPLAIN.
>
>>          ->  Index Scan using u_counts_i2 on u_counts c
>> (cost=0.00..53.53 rows=1082 width=4) (actual time=0.277..1224.582
>> rows=392173 loops=1)
>
> I kinda think the devel system wouldn't be using an indexscan e ither
> if it had up-to-date ANALYZE statistics.  But even with the 1082 row
> estimate that seems a remarkably low cost estimate.  Have you been
> playing games with random_page_cost?  Maybe you forgot to duplicate  
> the
> devel system's cost parameters onto the production system?
>
> 			regards, tom lane


In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2008-02-24 18:57:48
Subject: Re: LISTEN / NOTIFY performance in 8.3
Previous:From: Tom LaneDate: 2008-02-24 17:50:31
Subject: Re: Weird issue with planner choosing seq scan

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group