Skip site navigation (1) Skip section navigation (2)

Re: sequence scan problem

From: John Beaver <john(dot)e(dot)beaver(at)gmail(dot)com>
To: Jeremy Harris <jgh(at)wizmail(dot)org>
Cc: Pgsql-Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: sequence scan problem
Date: 2008-06-30 10:59:00
Message-ID: 4868BC74.6020006@gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Ok, here's the explain analyze result. Again, this is Postgres 8.3.3 and 
I vacuumed-analyzed both tables directly after they were created.


# explain analyze select fls.function_verified, fls.score, 
fls.go_category_group_ref, fs1.gene_ref, fs1.function_verified_exactly, 
fs2.gene_ref, fs2.function_verified_exactly from 
functional_linkage_scores fls, gene_prediction_view fs1, 
gene_prediction_view fs2 where fls.gene_ref1 = fs1.gene_ref and 
fls.gene_ref2 = fs2.gene_ref and fs1.go_term_ref = 2 and fs2.go_term_ref 
= 2;
                                                                                              
QUERY 
PLAN                                                                                              

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1399203593.41..6702491234.74 rows=352770803726 
width=22) (actual time=6370194.467..22991303.434 rows=15610535128 loops=1)
   Merge Cond: (fs2.gene_ref = fls.gene_ref2)
   ->  Index Scan using gene_prediction_view_gene_ref on 
gene_prediction_view fs2  (cost=0.00..12111899.77 rows=197899 width=5) 
(actual time=29.592..469838.583 rows=180629 loops=1)
         Index Cond: (go_term_ref = 2)
   ->  Materialize  (cost=1399069432.20..1483728633.52 rows=6772736105 
width=21) (actual time=6370164.864..16623552.417 rows=15610535121 loops=1)
         ->  Sort  (cost=1399069432.20..1416001272.47 rows=6772736105 
width=21) (actual time=6370164.860..13081970.248 rows=1897946790 loops=1)
               Sort Key: fls.gene_ref2
               Sort Method:  external merge  Disk: 61192240kB
               ->  Merge Join  (cost=40681244.97..154286110.62 
rows=6772736105 width=21) (actual time=592112.778..2043161.851 
rows=1897946790 loops=1)
                     Merge Cond: (fs1.gene_ref = fls.gene_ref1)
                     ->  Index Scan using gene_prediction_view_gene_ref 
on gene_prediction_view fs1  (cost=0.00..12111899.77 rows=197899 
width=5) (actual time=0.015..246613.129 rows=180644 loops=1)
                           Index Cond: (go_term_ref = 2)
                     ->  Materialize  (cost=40586010.10..43490582.70 
rows=232365808 width=20) (actual time=592112.755..1121366.375 
rows=1897946783 loops=1)
                           ->  Sort  (cost=40586010.10..41166924.62 
rows=232365808 width=20) (actual time=592112.721..870349.308 
rows=232241678 loops=1)
                                 Sort Key: fls.gene_ref1
                                 Sort Method:  external merge  Disk: 
7260856kB
                                 ->  Seq Scan on 
functional_linkage_scores fls  (cost=0.00..3928457.08 rows=232365808 
width=20) (actual time=14.221..86455.902 rows=232241678 loops=1)
 Total runtime: 24183346.271 ms
(18 rows)





Jeremy Harris wrote:
> John Beaver wrote:
>> I'm having a strange problem with a query. The query is fairly 
>> simple, with a few constants and two joins. All relevant columns 
>> should be indexed, and I'm pretty sure there aren't any type 
>> conversion issues. But the query plan includes a fairly heavy seq 
>> scan. The only possible complication is that the tables involved are 
>> fairly large - hundreds of millions of rows each.
>>
>> Can anyone explain this? There should only ever be a maximum of about 
>> 50 rows returned when the query is executed.
>
> You didn't say when you last vacuumed?
> If there should only be 50 rows returned then the estimates from the
> planner are way out.
>
> If that doesn't help, we'll need version info, and (if you can afford
> the time) an "explain analyze"
>
> Cheers,
>  Jeremy
>

In response to

Responses

pgsql-performance by date

Next:From: Moritz OnkenDate: 2008-06-30 12:46:22
Subject: Re: Planner should use index on a LIKE 'foo%' query
Previous:From: Dimitri FontaineDate: 2008-06-30 10:20:40
Subject: Re: Planner should use index on a LIKE 'foo%' query

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group