Skip site navigation (1) Skip section navigation (2)

Strange left outer join performance issue

From: "Noah M(dot) Daniels" <ndaniels(at)mac(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Strange left outer join performance issue
Date: 2007-03-23 19:44:44
Message-ID: 5335C295-4D3E-4A1B-A270-6B822377DD25@mac.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

I have two queries that are very similar, that run on the same table  
with slightly different conditions. However, despite a similar number  
of rows returned, the query planner is insisting on a different  
ordering and different join algorithm, causing a huge performance  
hit. I'm not sure why the planner is doing the merge join the way it  
is in the slow case, rather than following a similar plan to the fast  
case.

Notice that the difference in the query is near the very end, where  
it's supplier_alias_id vs. buyer_alias_id and company_type =  
'Supplier' vs 'Buyer'.

What I don't get is why, in the slow (supplier) case, the index scan  
on customs_records is done first without the index condition of  
cr.supplier_alias_id = "outer".id, which means selecting 1.7 million  
rows; why wouldn't it do a nested loop left join and have the index  
condition use that alias id the way the fast ('buyer') query is done?

I'd appreciate any help -- thanks!

SLOW:

select a.id as alias_id, a.company_type as alias_company_type, a.name  
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as  
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3  
as customs_record_saddr3, cr.consignee as customs_record_consignee,  
cr.caddr1 as customs_record_caddr1, cr.caddr2 as  
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,  
cr.notify_party as customs_record_notify_party, cr.naddr1 as  
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3  
as customs_record_naddr3, cr.also_notify_party as  
customs_record_also_notify_party, cr.anaddr1 as  
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,  
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,  
cr.buyer_field as customs_record_buyer_field from aliases a left  
outer join customs_records cr on cr.supplier_alias_id = a.id where  
a.company_type = 'Supplier' and a.company_id is NULL


Merge Right Join  (cost=1138.78..460482.84 rows=2993 width=405)  
(actual time=1244745.427..1245714.571 rows=39 loops=1)
   Merge Cond: ("outer".supplier_alias_id = "inner".id)
   ->  Index Scan using index_customs_records_on_supplier_alias_id on  
customs_records cr  (cost=0.00..6717806.37 rows=1704859 width=363)  
(actual time=54.567..1245210.707 rows=117424 loops=1)
   ->  Sort  (cost=1138.78..1139.53 rows=300 width=46) (actual  
time=24.093..24.161 rows=39 loops=1)
         Sort Key: a.id
         ->  Index Scan using index_aliases_company_type_company_id  
on aliases a  (cost=0.00..1126.44 rows=300 width=46) (actual  
time=22.400..23.959 rows=10 loops=1)
               Index Cond: ((company_type)::text = 'Supplier'::text)
               Filter: (company_id IS NULL)
Total runtime: 1245714.752 ms

FAST:

Nested Loop Left Join  (cost=0.00..603052.46 rows=3244 width=405)  
(actual time=68.526..3115.407 rows=1355 loops=1)
    ->  Index Scan using index_aliases_company_type_company_id on  
aliases a  (cost=0.00..639.56 rows=165 width=46) (actual  
time=32.419..132.286 rows=388 loops=1)
          Index Cond: ((company_type)::text = 'Buyer'::text)
          Filter: (company_id IS NULL)
    ->  Index Scan using index_customs_records_on_buyer_alias_id on  
customs_records cr  (cost=0.00..3639.55 rows=915 width=363) (actual  
time=2.133..7.649 rows=3 loops=388)
          Index Cond: (cr.buyer_alias_id = "outer".id)
Total runtime: 3117.713 ms
(7 rows)

select a.id as alias_id, a.company_type as alias_company_type, a.name  
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as  
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3  
as customs_record_saddr3, cr.consignee as customs_record_consignee,  
cr.caddr1 as customs_record_caddr1, cr.caddr2 as  
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,  
cr.notify_party as customs_record_notify_party, cr.naddr1 as  
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3  
as customs_record_naddr3, cr.also_notify_party as  
customs_record_also_notify_party, cr.anaddr1 as  
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,  
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,  
cr.buyer_field as customs_record_buyer_field from aliases a left  
outer join customs_records cr on cr.buyer_alias_id = a.id where  
a.company_type = 'Buyer' and a.company_id is NULL


Responses

pgsql-performance by date

Next:From: Daniel Cristian CruzDate: 2007-03-23 20:04:51
Subject: Re: Strange left outer join performance issue
Previous:From: Michael StoneDate: 2007-03-23 17:13:02
Subject: Re: Parallel Vacuum

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group