Skip site navigation (1) Skip section navigation (2)

Re: Query on view radically slower than query on underlying table

From: Craig James <craig_james(at)emolecules(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query on view radically slower than query on underlying table
Date: 2011-02-28 19:23:52
Message-ID: 4D6BF648.8090105@emolecules.com (view raw or flat)
Thread:
Lists: pgsql-performance
> Craig James<craig_james(at)emolecules(dot)com>  writes:
>> Here is the "bad" query, which is run on the view:
>>
>> em=> explain analyze
>> select version.version_id, version.isosmiles
>> from hitlist_rows_reset_140
>> left join version on (hitlist_rows_reset_140.objectid = version.version_id)
>> where hitlist_rows_reset_140.sortorder >= 1
>> and hitlist_rows_reset_140.sortorder <= 10
>> order by hitlist_rows_reset_140.sortorder;
>> QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>> ------------------------------
>> Nested Loop Left Join (cost=23687.51..215315.74 rows=1 width=54) (actual time=2682.662..63680.076 rows=10 loops=1)
>> Join Filter: (hitlist_rows_reset_140.objectid = v.version_id)
>> -> Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140 (cost=0.00..8.36 rows=1 width=8) (actual time=
>> 0.015..0.049 rows=10 loops=1)
>> Index Cond: ((sortorder >= 1) AND (sortorder <= 10))
>> -> Hash Join (cost=23687.51..204666.54 rows=851267 width=50) (actual time=31.829..6263.403 rows=851267 loops=10)
>> Hash Cond: (v.version_id = mv.version_id)
>> -> Seq Scan on version v (cost=0.00..116146.68 rows=5631968 width=50) (actual time=0.006..859.758 rows=5632191 loo
>> ps=10)
>> -> Hash (cost=13046.67..13046.67 rows=851267 width=4) (actual time=317.488..317.488 rows=851267 loops=1)
>> -> Seq Scan on my_version mv (cost=0.00..13046.67 rows=851267 width=4) (actual time=2.888..115.166 rows=8512
>> 67 loops=1)
>> Total runtime: 63680.162 ms

On 2/28/11 10:57 AM, Tom Lane wrote:
>> My guess (and it's just a wild guess) is that the "left join" is
>> forcing a sequence scan or something.
>
> No, that's forcing the other join to be done in toto because it can't
> reorder the left join and regular join.

I change the "left join" to just "join" and confirmed that it's fast -- the join on the view drops from 65 seconds back down to a few milliseconds.

Then I thought maybe putting a foreign-key constraint on table "my_version" would solve the problem:

   alter table my_version add constraint fk_my_view foreign key(version_id)
   references registry.version(version_id) on delete cascade;

That way, the planner would know that every key in table "my_version" has to also be in table "version", thus avoiding that part about "forcing the other join to be done in toto".  But the foreign-key constraint makes no difference, it still does the full join and takes 65 seconds.

So here's how I see it:

   - The select can only return ten rows from table "hitlist_rows_reset_140"
   - The left join could be applied to table "my_version"
   - The results of that could be joined to table "version"

It seems to me that with the foreign-key constraint, it shouldn't have to examine more than ten rows from any of the three tables.  Or have I overlooked something?

Thanks,
Craig

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2011-02-28 19:58:28
Subject: Re: Query on view radically slower than query on underlying table
Previous:From: Josh BerkusDate: 2011-02-28 19:09:55
Subject: Anyone tried Flashcache with PostgreSQL?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group