Re: Improving Inner Join Performance

From: "Andy" <frum(at)ar-sd(dot)net>
To: "Pandurangan R S" <pandurangan(dot)r(dot)s(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Improving Inner Join Performance
Date: 2006-01-06 10:21:25
Message-ID: 00c001c612aa$f39ed1d0$0b00a8c0@forge
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

At the moment: o.id_status>3 can have values only 4 and 6. The 6 is around
90% from the whole table. This is why seq scan is made.

Now, depending on the user input the query can have more where fields. For
example:
SELECT count(*) FROM orders o
INNER JOIN report r ON r.id_order=o.id
WHERE o.id_status > 3 AND r.id_zufriden=7

Aggregate (cost=7317.15..7317.15 rows=1 width=0) (actual
time=213.418..213.419 rows=1 loops=1)
-> Hash Join (cost=3139.00..7310.80 rows=2540 width=0) (actual
time=57.554..212.215 rows=1308 loops=1)
Hash Cond: ("outer".id = "inner".id_order)
-> Seq Scan on orders o (cost=0.00..3785.31 rows=72216 width=4)
(actual time=0.014..103.292 rows=72121 loops=1)
Filter: (id_status > 3)
-> Hash (cost=3132.51..3132.51 rows=2597 width=4) (actual
time=57.392..57.392 rows=0 loops=1)
-> Seq Scan on report r (cost=0.00..3132.51 rows=2597
width=4) (actual time=0.019..56.220 rows=1308 loops=1)
Filter: (id_zufriden = 7)
Total runtime: 213.514 ms

These examples can go on and on.

If I run this query
SELECT count(*) FROM orders o
INNER JOIN report r ON r.id_order=o.id
WHERE o.id_status>3
under normal system load the average response time is between 1.3 > 2.5
seconds. Sometimes even more. If I run it rapidly a few times then it
respondes faster(that is normal I supose).

The ideea of this query is to count all the possible results that the user
can have. I use this to build pages of results.

Andy.

----- Original Message -----
From: "Pandurangan R S" <pandurangan(dot)r(dot)s(at)gmail(dot)com>
To: "Andy" <frum(at)ar-sd(dot)net>
Cc: <pgsql-performance(at)postgresql(dot)org>
Sent: Friday, January 06, 2006 11:56 AM
Subject: Re: [PERFORM] Improving Inner Join Performance

> If the users puts in some other search fields on the where then the query
> runs faster but > in this format sometimes it takes a lot lot of
> time(sometimes even 2,3 seconds).

Can you eloborate under what conditions which query is slower?

On 1/5/06, Andy <frum(at)ar-sd(dot)net> wrote:
>
> Hi to all,
>
> I have the following query:
>
> SELECT count(*) FROM orders o
> INNER JOIN report r ON r.id_order=o.id
> WHERE o.id_status>3
>
> Explaing analyze:
> Aggregate (cost=8941.82..8941.82 rows=1 width=0) (actual
> time=1003.297..1003.298 rows=1 loops=1)
> -> Hash Join (cost=3946.28..8881.72 rows=24041 width=0) (actual
> time=211.985..951.545 rows=72121 loops=1)
> Hash Cond: ("outer".id_order = "inner".id)
> -> Seq Scan on report r (cost=0.00..2952.21 rows=72121 width=4)
> (actual time=0.005..73.869 rows=72121 loops=1)
> -> Hash (cost=3787.57..3787.57 rows=24682 width=4) (actual
> time=211.855..211.855 rows=0 loops=1)
> -> Seq Scan on orders o (cost=0.00..3787.57 rows=24682
> width=4) (actual time=0.047..147.170 rows=72121 loops=1)
> Filter: (id_status > 3)
> Total runtime: 1003.671 ms
>
>
> I could use it in the following format, because I have to the moment only
> the 4,6 values for the id_status.
>
> SELECT count(*) FROM orders o
> INNER JOIN report r ON r.id_order=o.id
> WHERE o.id_status IN (4,6)
>
> Explain analyze:
> Aggregate (cost=5430.04..5430.04 rows=1 width=0) (actual
> time=1472.877..1472.877 rows=1 loops=1)
> -> Hash Join (cost=2108.22..5428.23 rows=720 width=0) (actual
> time=342.080..1419.775 rows=72121 loops=1)
> Hash Cond: ("outer".id_order = "inner".id)
> -> Seq Scan on report r (cost=0.00..2952.21 rows=72121 width=4)
> (actual time=0.036..106.217 rows=72121 loops=1)
> -> Hash (cost=2106.37..2106.37 rows=739 width=4) (actual
> time=342.011..342.011 rows=0 loops=1)
> -> Index Scan using orders_id_status_idx,
> orders_id_status_idx on orders o (cost=0.00..2106.37 rows=739 width=4)
> (actual time=0.131..268.397 rows=72121 loops=1)
> Index Cond: ((id_status = 4) OR (id_status = 6))
> Total runtime: 1474.356 ms
>
> How can I improve this query's performace?? The ideea is to count all the
> values that I have in the database for the following conditions. If the
> users puts in some other search fields on the where then the query runs
> faster but in this format sometimes it takes a lot lot of time(sometimes
> even 2,3 seconds).
>
> Can this be tuned somehow???
>
> Regards,
> Andy.
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ian Westmacott 2006-01-06 14:00:06 Re: improving write performance for logging
Previous Message Pandurangan R S 2006-01-06 09:56:31 Re: Improving Inner Join Performance