Re: Query on view radically slower than query on underlying table

From: Craig James <craig_james(at)emolecules(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query on view radically slower than query on underlying table
Date: 2011-02-28 19:23:52
Message-ID: 4D6BF648.8090105@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Craig James<craig_james(at)emolecules(dot)com> writes:
>> Here is the "bad" query, which is run on the view:
>>
>> em=> explain analyze
>> select version.version_id, version.isosmiles
>> from hitlist_rows_reset_140
>> left join version on (hitlist_rows_reset_140.objectid = version.version_id)
>> where hitlist_rows_reset_140.sortorder >= 1
>> and hitlist_rows_reset_140.sortorder <= 10
>> order by hitlist_rows_reset_140.sortorder;
>> QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>> ------------------------------
>> Nested Loop Left Join (cost=23687.51..215315.74 rows=1 width=54) (actual time=2682.662..63680.076 rows=10 loops=1)
>> Join Filter: (hitlist_rows_reset_140.objectid = v.version_id)
>> -> Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140 (cost=0.00..8.36 rows=1 width=8) (actual time=
>> 0.015..0.049 rows=10 loops=1)
>> Index Cond: ((sortorder >= 1) AND (sortorder <= 10))
>> -> Hash Join (cost=23687.51..204666.54 rows=851267 width=50) (actual time=31.829..6263.403 rows=851267 loops=10)
>> Hash Cond: (v.version_id = mv.version_id)
>> -> Seq Scan on version v (cost=0.00..116146.68 rows=5631968 width=50) (actual time=0.006..859.758 rows=5632191 loo
>> ps=10)
>> -> Hash (cost=13046.67..13046.67 rows=851267 width=4) (actual time=317.488..317.488 rows=851267 loops=1)
>> -> Seq Scan on my_version mv (cost=0.00..13046.67 rows=851267 width=4) (actual time=2.888..115.166 rows=8512
>> 67 loops=1)
>> Total runtime: 63680.162 ms

On 2/28/11 10:57 AM, Tom Lane wrote:
>> My guess (and it's just a wild guess) is that the "left join" is
>> forcing a sequence scan or something.
>
> No, that's forcing the other join to be done in toto because it can't
> reorder the left join and regular join.

I change the "left join" to just "join" and confirmed that it's fast -- the join on the view drops from 65 seconds back down to a few milliseconds.

Then I thought maybe putting a foreign-key constraint on table "my_version" would solve the problem:

alter table my_version add constraint fk_my_view foreign key(version_id)
references registry.version(version_id) on delete cascade;

That way, the planner would know that every key in table "my_version" has to also be in table "version", thus avoiding that part about "forcing the other join to be done in toto". But the foreign-key constraint makes no difference, it still does the full join and takes 65 seconds.

So here's how I see it:

- The select can only return ten rows from table "hitlist_rows_reset_140"
- The left join could be applied to table "my_version"
- The results of that could be joined to table "version"

It seems to me that with the foreign-key constraint, it shouldn't have to examine more than ten rows from any of the three tables. Or have I overlooked something?

Thanks,
Craig

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2011-02-28 19:58:28 Re: Query on view radically slower than query on underlying table
Previous Message Josh Berkus 2011-02-28 19:09:55 Anyone tried Flashcache with PostgreSQL?