Skip site navigation (1) Skip section navigation (2)

same plan, different time

From: Litao Wu <litaowu(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: same plan, different time
Date: 2004-07-16 21:36:30
Message-ID: 20040716213630.27909.qmail@web13125.mail.yahoo.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

I have a query, which runs fast for one id (query 1)
and slow for other id (query 2)
though both plans and cost are same except
these two qeries return different number of rows.

explain analyze
SELECT *
FROM user U LEFT JOIN user_timestamps T USING
(user_id), user_alias A
WHERE U.user_id = A.user_id  AND A.domain_id=7551070;

\g
                                                      
                       QUERY PLAN                     
               
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=234.22..61015.98 rows=12 width=238)
(actual time=7.73..7.73 rows=0 loops=1)
   Merge Cond: ("outer".user_id = "inner".user_id)
   ->  Merge Join  (cost=0.00..58585.67 rows=909864
width=180) (actual time=0.07..0.07 rows=1 loops=1)
         Merge Cond: ("outer".user_id =
"inner".user_id)
         ->  Index Scan using user_pkey on user u 
(cost=0.00..29714.99 rows=909864 width=156) (actual
time=0.04..0.04 rows=1 loops=1)
         ->  Index Scan using user_timestamps_uid_idx
on user_timestamps t  (cost=0.00..16006.05 rows=706896
width=24) (actual time=0.02..0.02 rows=1 loops=1)
   ->  Sort  (cost=234.22..234.25 rows=12 width=58)
(actual time=7.65..7.65 rows=0 loops=1)
         Sort Key: a.user_id
         ->  Seq Scan on user_alias a 
(cost=0.00..234.00 rows=12 width=58) (actual
time=7.61..7.61 rows=0 loops=1)
               Filter: (domain_id = 7551070)
 Total runtime: 7.96 msec
(11 rows)

explain analyze
SELECT *
FROM user U LEFT JOIN user_timestamps T USING
(user_id), user_alias A
WHERE U.user_id = A.user_id  AND
A.domain_id=2005921193;
\g
                                                      
                            QUERY PLAN                
               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=247.92..61035.28 rows=332
width=238) (actual time=94511.70..95127.94 rows=493
loops=1)
   Merge Cond: ("outer".user_id = "inner".user_id)
   ->  Merge Join  (cost=0.00..58585.67 rows=909864
width=180) (actual time=6.43..93591.06 rows=897655
loops=1)
         Merge Cond: ("outer".user_id =
"inner".user_id)
         ->  Index Scan using user_pkey on user u 
(cost=0.00..29714.99 rows=909864 width=156) (actual
time=6.29..55634.85 rows=897655 loops=1)
         ->  Index Scan using user_timestamps_uid_idx
on user_timestamps t  (cost=0.00..16006.05 rows=706896
width=24) (actual time=0.10..20331.13 rows=700466
loops=1)
   ->  Sort  (cost=247.92..248.75 rows=332 width=58)
(actual time=10.76..11.17 rows=493 loops=1)
         Sort Key: a.user_id
         ->  Seq Scan on user_alias a 
(cost=0.00..234.00 rows=332 width=58) (actual
time=7.43..9.86 rows=493 loops=1)
               Filter: (domain_id = 2005921193)
 Total runtime: 95128.74 msec
(11 rows)

I also know if I change the order of 2nd query, it
will run much faster:

explain analyze
SELECT *
FROM (user_alias A JOIN user U USING (user_id) ) LEFT
JOIN user_timestamps T USING (user_id)
WHERE A.domain_id=2005921193;
\g
                                                      
                 QUERY PLAN                           
               
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2302.31 rows=332 width=238)
(actual time=15.32..256.54 rows=493 loops=1)
   ->  Nested Loop  (cost=0.00..1263.43 rows=332
width=214) (actual time=15.17..130.58 rows=493
loops=1)
         ->  Seq Scan on user_alias a 
(cost=0.00..234.00 rows=332 width=58) (actual
time=15.04..21.01 rows=493 loops=1)
               Filter: (domain_id = 2005921193)
         ->  Index Scan using user_pkey on user u 
(cost=0.00..3.08 rows=1 width=156) (actual
time=0.17..0.17 rows=1 loops=493)
               Index Cond: ("outer".user_id =
u.user_id)
   ->  Index Scan using user_timestamps_uid_idx on
user_timestamps t  (cost=0.00..3.11 rows=1 width=24)
(actual time=0.16..0.23 rows=1 loops=493)
         Index Cond: ("outer".user_id = t.user_id)
 Total runtime: 257.79 msec
(9 rows)



user with 911932 rows
  user_id - PK

user_timestamps with 708851 rows
  user_id - FK with index 

user_alias with 9689 rows
  user_id - FK with index
  domain_id - no index on this column

My questions are:
1. Why 1st "Merge Join" in 2nd query gets actual
rows=897655 while 1st "Merge Join" in 1st query is
actual rows=1?

If I know the answer, I will understand:
Why 1st "Merge Join" in 2nd query took so longer time
than 1st "Merge Join" in 1st query?

2. Why PG optimzer is not smart enough to use 3rd
(nested Loop) plan?

Thanks,




		
__________________________________
Do you Yahoo!?
Vote for the stars of Yahoo!'s next ad campaign!
http://advision.webevents.yahoo.com/yahoo/votelifeengine/


Responses

pgsql-performance by date

Next:From: Rod TaylorDate: 2004-07-17 02:31:36
Subject: Re: extrem bad performance
Previous:From: Hervé PiedvacheDate: 2004-07-16 11:17:02
Subject: Re: Insert are going slower ...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group