Re: another query optimization question

From: David Teran <david(dot)teran(at)cluster9(dot)com>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org>
Subject: Re: another query optimization question
Date: 2004-01-30 18:20:24
Message-ID: F8B3002A-5350-11D8-9392-000A95A6F0DC@cluster9.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

On 30.01.2004, at 19:10, Stephan Szabo wrote:

>
> On Fri, 30 Jan 2004, David Teran wrote:
>
>> select
>> sum(job_property_difference(t0.int_value, t1.int_value)) as rank
>> from
>> job_property t0,
>> job_property t1
>> where
>> t0.id_job_profile = 911
>> and t0.id_job_attribute = t1.id_job_attribute
>> and t1.id_job_profile in (select id_job_profile from unemployed)
>> and t1.id_job_profile <> 911;
>>
>> results in a query plan result:
>
> Can we see explain analyze output for the query, it'll give more
> information about actual time and row counts than plain explain.
>

sure, here it is comes. What we need to achieve is: we have different
job_profiles, each profile has multiple values. For one given profile
we need the ' sum of the distance of every value in the given profile
and every other profile'. The result is usually grouped by the profile
id but to make the query easier i removed this, it does not cost too
much time and it turned out that this query here uses most of the time.

thanks, David


QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------------------------------
Aggregate (cost=2689349.81..2689349.81 rows=1 width=8) (actual
time=100487.423..100487.423 rows=1 loops=1)
-> Merge Join (cost=2451266.53..2655338.83 rows=13604393 width=8)
(actual time=82899.466..-2371037.726 rows=2091599 loops=1)
Merge Cond: ("outer".id_job_attribute =
"inner".id_job_attribute)
-> Sort (cost=97.43..100.63 rows=1281 width=8) (actual
time=3.937..4.031 rows=163 loops=1)
Sort Key: t0.id_job_attribute
-> Index Scan using
job_property__id_job_profile__fk_index on job_property t0
(cost=0.00..31.31 rows=1281 width=8) (actual time=1.343..3.766 rows=163
loops=1)
Index Cond: (id_job_profile = 911)
-> Sort (cost=2451169.10..2483246.47 rows=12830947 width=8)
(actual time=82891.076..-529619.213 rows=4187378 loops=1)
Sort Key: t1.id_job_attribute
-> Hash IN Join (cost=507.32..439065.37 rows=12830947
width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
Hash Cond: ("outer".id_job_profile =
"inner".id_job_profile)
-> Seq Scan on job_property t1
(cost=0.00..246093.84 rows=12830947 width=12) (actual
time=0.136..19101.796 rows=8482533 loops=1)
Filter: (id_job_profile <> 911)
-> Hash (cost=467.46..467.46 rows=15946 width=4)
(actual time=61.313..61.313 rows=0 loops=1)
-> Seq Scan on unemployed
(cost=0.00..467.46 rows=15946 width=4) (actual time=0.157..50.842
rows=15960 loops=1)
Total runtime: 103769.592 ms

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-01-30 19:19:26 Re: another query optimization question
Previous Message Stephan Szabo 2004-01-30 18:10:16 Re: another query optimization question