Re: sequence scan problem

From: John Beaver <john(dot)e(dot)beaver(at)gmail(dot)com>
To: Jeremy Harris <jgh(at)wizmail(dot)org>
Cc: Pgsql-Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: sequence scan problem
Date: 2008-06-30 10:59:00
Message-ID: 4868BC74.6020006@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ok, here's the explain analyze result. Again, this is Postgres 8.3.3 and
I vacuumed-analyzed both tables directly after they were created.

# explain analyze select fls.function_verified, fls.score,
fls.go_category_group_ref, fs1.gene_ref, fs1.function_verified_exactly,
fs2.gene_ref, fs2.function_verified_exactly from
functional_linkage_scores fls, gene_prediction_view fs1,
gene_prediction_view fs2 where fls.gene_ref1 = fs1.gene_ref and
fls.gene_ref2 = fs2.gene_ref and fs1.go_term_ref = 2 and fs2.go_term_ref
= 2;

QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=1399203593.41..6702491234.74 rows=352770803726
width=22) (actual time=6370194.467..22991303.434 rows=15610535128 loops=1)
Merge Cond: (fs2.gene_ref = fls.gene_ref2)
-> Index Scan using gene_prediction_view_gene_ref on
gene_prediction_view fs2 (cost=0.00..12111899.77 rows=197899 width=5)
(actual time=29.592..469838.583 rows=180629 loops=1)
Index Cond: (go_term_ref = 2)
-> Materialize (cost=1399069432.20..1483728633.52 rows=6772736105
width=21) (actual time=6370164.864..16623552.417 rows=15610535121 loops=1)
-> Sort (cost=1399069432.20..1416001272.47 rows=6772736105
width=21) (actual time=6370164.860..13081970.248 rows=1897946790 loops=1)
Sort Key: fls.gene_ref2
Sort Method: external merge Disk: 61192240kB
-> Merge Join (cost=40681244.97..154286110.62
rows=6772736105 width=21) (actual time=592112.778..2043161.851
rows=1897946790 loops=1)
Merge Cond: (fs1.gene_ref = fls.gene_ref1)
-> Index Scan using gene_prediction_view_gene_ref
on gene_prediction_view fs1 (cost=0.00..12111899.77 rows=197899
width=5) (actual time=0.015..246613.129 rows=180644 loops=1)
Index Cond: (go_term_ref = 2)
-> Materialize (cost=40586010.10..43490582.70
rows=232365808 width=20) (actual time=592112.755..1121366.375
rows=1897946783 loops=1)
-> Sort (cost=40586010.10..41166924.62
rows=232365808 width=20) (actual time=592112.721..870349.308
rows=232241678 loops=1)
Sort Key: fls.gene_ref1
Sort Method: external merge Disk:
7260856kB
-> Seq Scan on
functional_linkage_scores fls (cost=0.00..3928457.08 rows=232365808
width=20) (actual time=14.221..86455.902 rows=232241678 loops=1)
Total runtime: 24183346.271 ms
(18 rows)

Jeremy Harris wrote:
> John Beaver wrote:
>> I'm having a strange problem with a query. The query is fairly
>> simple, with a few constants and two joins. All relevant columns
>> should be indexed, and I'm pretty sure there aren't any type
>> conversion issues. But the query plan includes a fairly heavy seq
>> scan. The only possible complication is that the tables involved are
>> fairly large - hundreds of millions of rows each.
>>
>> Can anyone explain this? There should only ever be a maximum of about
>> 50 rows returned when the query is executed.
>
> You didn't say when you last vacuumed?
> If there should only be 50 rows returned then the estimates from the
> planner are way out.
>
> If that doesn't help, we'll need version info, and (if you can afford
> the time) an "explain analyze"
>
> Cheers,
> Jeremy
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Moritz Onken 2008-06-30 12:46:22 Re: Planner should use index on a LIKE 'foo%' query
Previous Message Dimitri Fontaine 2008-06-30 10:20:40 Re: Planner should use index on a LIKE 'foo%' query