Strange left outer join performance issue

From: "Noah M(dot) Daniels" <ndaniels(at)mac(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Strange left outer join performance issue
Date: 2007-03-23 19:44:44
Message-ID: 5335C295-4D3E-4A1B-A270-6B822377DD25@mac.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I have two queries that are very similar, that run on the same table
with slightly different conditions. However, despite a similar number
of rows returned, the query planner is insisting on a different
ordering and different join algorithm, causing a huge performance
hit. I'm not sure why the planner is doing the merge join the way it
is in the slow case, rather than following a similar plan to the fast
case.

Notice that the difference in the query is near the very end, where
it's supplier_alias_id vs. buyer_alias_id and company_type =
'Supplier' vs 'Buyer'.

What I don't get is why, in the slow (supplier) case, the index scan
on customs_records is done first without the index condition of
cr.supplier_alias_id = "outer".id, which means selecting 1.7 million
rows; why wouldn't it do a nested loop left join and have the index
condition use that alias id the way the fast ('buyer') query is done?

I'd appreciate any help -- thanks!

SLOW:

select a.id as alias_id, a.company_type as alias_company_type, a.name
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3
as customs_record_saddr3, cr.consignee as customs_record_consignee,
cr.caddr1 as customs_record_caddr1, cr.caddr2 as
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,
cr.notify_party as customs_record_notify_party, cr.naddr1 as
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3
as customs_record_naddr3, cr.also_notify_party as
customs_record_also_notify_party, cr.anaddr1 as
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,
cr.buyer_field as customs_record_buyer_field from aliases a left
outer join customs_records cr on cr.supplier_alias_id = a.id where
a.company_type = 'Supplier' and a.company_id is NULL

Merge Right Join (cost=1138.78..460482.84 rows=2993 width=405)
(actual time=1244745.427..1245714.571 rows=39 loops=1)
Merge Cond: ("outer".supplier_alias_id = "inner".id)
-> Index Scan using index_customs_records_on_supplier_alias_id on
customs_records cr (cost=0.00..6717806.37 rows=1704859 width=363)
(actual time=54.567..1245210.707 rows=117424 loops=1)
-> Sort (cost=1138.78..1139.53 rows=300 width=46) (actual
time=24.093..24.161 rows=39 loops=1)
Sort Key: a.id
-> Index Scan using index_aliases_company_type_company_id
on aliases a (cost=0.00..1126.44 rows=300 width=46) (actual
time=22.400..23.959 rows=10 loops=1)
Index Cond: ((company_type)::text = 'Supplier'::text)
Filter: (company_id IS NULL)
Total runtime: 1245714.752 ms

FAST:

Nested Loop Left Join (cost=0.00..603052.46 rows=3244 width=405)
(actual time=68.526..3115.407 rows=1355 loops=1)
-> Index Scan using index_aliases_company_type_company_id on
aliases a (cost=0.00..639.56 rows=165 width=46) (actual
time=32.419..132.286 rows=388 loops=1)
Index Cond: ((company_type)::text = 'Buyer'::text)
Filter: (company_id IS NULL)
-> Index Scan using index_customs_records_on_buyer_alias_id on
customs_records cr (cost=0.00..3639.55 rows=915 width=363) (actual
time=2.133..7.649 rows=3 loops=388)
Index Cond: (cr.buyer_alias_id = "outer".id)
Total runtime: 3117.713 ms
(7 rows)

select a.id as alias_id, a.company_type as alias_company_type, a.name
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3
as customs_record_saddr3, cr.consignee as customs_record_consignee,
cr.caddr1 as customs_record_caddr1, cr.caddr2 as
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,
cr.notify_party as customs_record_notify_party, cr.naddr1 as
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3
as customs_record_naddr3, cr.also_notify_party as
customs_record_also_notify_party, cr.anaddr1 as
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,
cr.buyer_field as customs_record_buyer_field from aliases a left
outer join customs_records cr on cr.buyer_alias_id = a.id where
a.company_type = 'Buyer' and a.company_id is NULL

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Daniel Cristian Cruz 2007-03-23 20:04:51 Re: Strange left outer join performance issue
Previous Message Michael Stone 2007-03-23 17:13:02 Re: Parallel Vacuum