Skip site navigation (1) Skip section navigation (2)

help needed -- sequential scan problem

From: sarlav kumar <sarlavk(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: help needed -- sequential scan problem
Date: 2004-11-19 22:31:22
Message-ID: 20041119223122.49470.qmail@web51301.mail.yahoo.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi All,
 
I am new to Postgres.
 
I have a query which does not use index scan unless I force postgres to use index scan. I dont want to force postgres, unless there is no way of optimizing this query.
 
The query :
 
select m.company_name,m.approved,cu.account_no,mbt.business_name,cda.country, 
(select count(*) from merchant_purchase mp left join data d on mp.data_id=d.id where mp.merchant_id=m.id and d.status=5) as Trans_count,
(select sum(total * 0.01) from merchant_purchase mp left join data d on mp.data_id=d.id where mp.merchant_id=m.id and d.status=5) as Trans_amount,
(select count(*) from merchant_purchase mp left join data d on mp.data_id=d.id where d.what=15 and d.status=5 and d.flags=7 and mp.merchant_id=m.id) as Reversal_count
from merchant m 
left join customer cu on cu.id=m.uid 
left join customerdata cda on cda.uid=cu.id 
left join merchant_business_types mbt on mbt.id=m.businesstype and
exists (select distinct(merchant_id) from merchant_purchase where m.id=merchant_id);

 
First Question: I know the way I have written the first two sub-selects is really bad, as they have the same conditions in the where clause. But I am not sure if there is a way to select two columns in a single sub-select query. When I tried to combine the two sub-select queries, I got an error saying that the sub-select can have only one column. Does anyone know any other efficient way of doing it?
 
Second Question: The query plan is as follows:
 
 QUERY PLAN                                                      
                        
-------------------------------------------------------------------------------------------------------------------------------- Hash Join  (cost=901.98..17063.67 rows=619 width=88) (actual time=52.01..5168.09 rows=619 loops=1)
   Hash Cond: ("outer".businesstype = "inner".id)
   Join Filter: (subplan)
   ->  Merge Join  (cost=900.34..1276.04 rows=619 width=62) (actual time=37.00..97.58 rows=619 loops=1)
         Merge Cond: ("outer".id = "inner".uid)
         ->  Merge Join  (cost=900.34..940.61 rows=619 width=52) (actual time=36.91..54.66 rows=619 loops=1)
               Merge Cond: ("outer".id = "inner".uid)
               ->  Sort  (cost=795.45..810.32 rows=5949 width=17) (actual time=32.59..36.59 rows=5964 loops=1)
                     Sort Key: cu.id
                     ->  Seq Scan on customer cu  (cost=0.00..422.49 rows=5949 width=17) (actual time=0.02..15.69 rows=5964 loops=1)
               ->  Sort  (cost=104.89..106.44 rows=619 width=35) (actual time=4.27..5.10 rows=619 loops=1)
                     Sort Key: m.uid
                     ->  Seq Scan on merchant m  (cost=0.00..76.19 rows=619 width=35) (actual time=0.04..2.65 rows=619 loops=1)
         ->  Index Scan using customerdata_uid_idx on customerdata cda  (cost=0.00..311.85 rows=5914 width=10) (actual time=0.09..27.70 rows=5
919 loops=1)
   ->  Hash  (cost=1.51..1.51 rows=51 width=26) (actual time=0.19..0.19 rows=0 loops=1)
         ->  Seq Scan o n merchant_business_types mbt  (cost=0.00..1.51 rows=51 width=26) (actual time=0.04..0.12 rows=51 loops=1)
   SubPlan
     ->  Aggregate  (cost=269.89..269.89 rows=1 width=12) (actual time=2.70..2.70 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..269.78 rows=44 width=12) (actual time=2.40..2.69 rows=4 loops=619)
                 Filter: ("inner".status = 5)
                 ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=4) (actual time=2.37..2.58 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=8) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Aggregate  (cost=269.89..269.89 rows=1 width=16) (actual time=2.73..2.73 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..269.78 rows=44 width=16) (actual time=2.42..2.70 rows=4 loops=619)
                 Filter: ("inner".status = 5)
                 ->  Seq Scan on merchant_purchase m p  (cost=0.00..95.39 rows=44 width=8) (actual time=2.39..2.60 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=8) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Aggregate  (cost=270.12..270.12 rows=1 width=20) (actual time=2.72..2.72 rows=1 loops=619)
           ->  Nested Loop  (cost=0.00..270.00 rows=44 width=20) (actual time=2.63..2.72 rows=0 loops=619)
                 Filter: (("inner".what = 15) AND ("inner".status = 5) AND ("inner".flags = 7))
                 ->  Seq Scan on merchant_purchase mp  (cost=0.00..95.39 rows=44 width=4) (actual time=2.40..2.62 rows=6 loops=619)
                       Filter: (merchant_id = $0)
                 ->  Index Scan using data_pkey on data d  (cost=0.00..3.91 rows=1 width=16) (actual time=0.01..0.01 rows=1 loops=3951)
                       Index Cond: ("outer".data_id = d.id)
     ->  Unique  (cost=0.00..113.14 rows=4 width=4) (actual time=0.02..0.02 rows=0 loops=598)
           ->  Index Scan using merchant_purchase_merchant_id_idx on merchant_purchase  (cost=0.00..113.02 rows=44 width=4) (actual time=0.01.
.0.01 rows=0 loops=598)
                 Index Cond: ($0 = merchant_id)
 Total runtime: 5170.37 msec (5.170 sec)
(42 rows)

 
As you can see, there are many sequential scans in the query plan. Postgres is not using the index defined, even though it leads to better performance(0.2 sec!! when i force index scan)
 
Is there something wrong in my query that makes postgres use seq scan as opposed to index scan?? Any help would be really appreciated.
 
Thanks for you time and help!
Saranya

		
---------------------------------
Do you Yahoo!?
 Discover all thats new in My Yahoo!

Responses

pgsql-performance by date

Next:From: David ParkerDate: 2004-11-19 22:32:11
Subject: tablespace + RAM disk?
Previous:From: Andrew JanianDate: 2004-11-19 21:22:02
Subject: Re: Query Performance and IOWait

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group