Quick Links

Re: How to improve speed of 3 table join &group (HUGE tables)

From:	John Major <major(at)cbio(dot)mskcc(dot)org>
To:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: How to improve speed of 3 table join &group (HUGE tables)
Date:	2007-10-18 20:04:57
Message-ID:	4717BC69.2010809@cbio.mskcc.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hi Hekki-

When I turn seq_scan off for the new query:

explain
select sf.library_id, fio.clip_type , count(sf.sequence_id)
from sequence_fragment sf, fragment_external_info fio
where sf.seq_frag_id = fio.sequence_frag_id
and sf.sequence_id IN
(SELECT sequence_id from sequence_alignment)
group by sf.library_id, fio.clip_type

The index is used... but the cost gets worse!
it goes from:
11831119
-TO-
53654888

Actually... The new query executes in ~ 15 minutes... which is good
enough for me for now.

Thanks Nis!

john

Heikki Linnakangas wrote:
> John Major wrote:
>
>> ~there are indexes on all of the fields being joined (but not on
>> library_id or clip_type ). ~Everything has been re-analyzed post index
>> creation
>> ~I've tried "set enable_seqscan=off" and set (join_table_order or
>> something) = 1
>>
>
> Seqscanning and sorting a table is generally faster than a full scan of
> the table using an index scan, unless the heap is roughly in the index
> order. You probably need to CLUSTER the tables to use the indexes
> effectively.
>
> Are you sure you have an index on sequence_alignment.sequence_id? The
> planner seems to choose a seqscan + sort, even though you've set
> enable_seqscan=false.
>
>

In response to

Re: How to improve speed of 3 table join &group (HUGE tables) at 2007-10-18 19:06:11 from Heikki Linnakangas

Browse pgsql-performance by date

	From	Date	Subject
Next Message	ismo.tuononen	2007-10-19 04:40:17	Re: How to improve speed of 3 table join &group (HUGE tables)
Previous Message	Chris Kratz	2007-10-18 20:03:50	Re: Incorrect estimates on columns