Re: Redundant sub query triggers slow nested loop left join

From: "henk de wit" <henk53602(at)hotmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Redundant sub query triggers slow nested loop left join
Date: 2007-04-22 20:22:22
Message-ID: BAY106-F1ED861D02B3FF316BFF68F5540@phx.gbl
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-performance


>In the actual event, with
>359 rows out of the scan, the nestloop way is just horrid because it
>repeats the other side 359 times :-(

Indeed. :(

Btw, I tried to apply the removal of the redundant check in the larger query
(the one from which I extracted the part shown earlier) but it only performs
worse after that. The more redundant checks I remove, the slower the query
gets. I figure the original designer of the query inserted those checks to
quickly limit the number of rows involved in the nested loop. Of course, the
problem is probably not the number of rows involved, but the unfortunate
choice of the nested loop.

I spend a few hours today in trying to figure it all out, but I'm pretty
stuck at the moment.

For what its worth, this is the plan PG 8.2 comes up with right after I
remove the same check that made the isolated query in the openings post so
much faster:

Sort (cost=6006.54..6006.55 rows=1 width=597) (actual
time=14561.499..14561.722 rows=553 loops=1)
Sort Key: public.banners_links.id
-> Nested Loop Left Join (cost=3917.68..6006.53 rows=1 width=597)
(actual time=64.723..14559.811 rows=553 loops=1)
Join Filter: (public.banners_links.id =
public.fetch_banners.banners_links_id)
-> Nested Loop Left Join (cost=3917.68..6002.54 rows=1 width=527)
(actual time=64.607..14509.291 rows=553 loops=1)
Join Filter: (public.banners_links.id =
reward_ratings.banner_id)
-> Nested Loop Left Join (cost=2960.36..4395.12 rows=1
width=519) (actual time=52.761..8562.575 rows=553 loops=1)
Join Filter: (public.banners_links.id =
banners_banner_types.banner_id)
-> Nested Loop Left Join (cost=2000.60..2956.57 rows=1
width=484) (actual time=32.026..304.700 rows=359 loops=1)
Join Filter: (public.banners_links.id =
ecpc_per_banner_link.banners_links_id)
-> Nested Loop (cost=124.58..1075.70 rows=1
width=468) (actual time=9.793..187.724 rows=359 loops=1)
-> Nested Loop Left Join
(cost=124.58..1067.42 rows=1 width=89) (actual time=9.786..184.671 rows=359
loops=1)
Join Filter: (public.banners_links.id
= users_banners_tot_sub.banner_id)
-> Hash Left Join
(cost=107.97..1050.78 rows=1 width=81) (actual time=6.119..7.605 rows=359
loops=1)
Hash Cond:
(public.banners_links.id = special_deals.id)
Filter:
(special_deals.special_deal IS NULL)
-> Bitmap Heap Scan on
banners_links (cost=11.43..954.03 rows=2 width=73) (actual
time=0.128..1.069 rows=359 loops=1)
Recheck Cond: (merchant_id
= 5631)
Filter: ((status)::text =
'0'::text)
-> Bitmap Index Scan on
banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual
time=0.089..0.089 rows=424 loops=1)
Index Cond:
(merchant_id = 5631)
-> Hash (cost=86.93..86.93
rows=769 width=16) (actual time=5.982..5.982 rows=780 loops=1)
-> Subquery Scan
special_deals (cost=69.62..86.93 rows=769 width=16) (actual
time=4.179..5.414 rows=780 loops=1)
-> HashAggregate
(cost=69.62..79.24 rows=769 width=16) (actual time=4.179..4.702 rows=780
loops=1)
-> Seq Scan
on banner_deals (cost=0.00..53.75 rows=3175 width=16) (actual
time=0.006..1.480 rows=3175 loops=1)
-> HashAggregate (cost=16.61..16.62
rows=1 width=24) (actual time=0.011..0.292 rows=424 loops=359)
-> Nested Loop
(cost=0.00..16.60 rows=1 width=24) (actual time=0.029..3.096 rows=424
loops=1)
-> Index Scan using
users_banners_affiliate_id_idx on users_banners (cost=0.00..8.30 rows=1
width=16) (actual time=0.021..0.523 rows=424 loops=1)
Index Cond:
((affiliate_id = 5631) AND (affiliate_id = 5631))
Filter:
((status)::text = '3'::text)
-> Index Scan using
users_banners_id_idx on users_banners_rotation (cost=0.00..8.29 rows=1
width=16) (actual time=0.003..0.004 rows=1 loops=424)
Index Cond:
(users_banners_rotation.users_banners_id = users_banners.id)
-> Index Scan using
banners_org_id_banner.idx on banners_org (cost=0.00..8.27 rows=1 width=387)
(actual time=0.005..0.006 rows=1 loops=359)
Index Cond: (public.banners_links.id =
banners_org.id_banner)
-> Sort (cost=1876.01..1876.50 rows=194
width=30) (actual time=0.062..0.183 rows=290 loops=359)
Sort Key: CASE WHEN
(precalculated_stats_banners_links.clicks_total > 0) THEN
(((precalculated_stats_banners_links.revenue_total_affiliate /
(precalculated_stats_banners_links.clicks_total)::numeric))::double
precision / 1000::double precision) ELSE 0::double precision END
-> Merge IN Join (cost=1819.78..1868.64
rows=194 width=30) (actual time=16.701..21.797 rows=290 loops=1)
Merge Cond:
(precalculated_stats_banners_links.banners_links_id =
public.banners_links.id)
-> Sort (cost=849.26..869.24
rows=7993 width=30) (actual time=12.416..15.717 rows=7923 loops=1)
Sort Key:
precalculated_stats_banners_links.banners_links_id
-> Index Scan using
pre_calc_banners_status on precalculated_stats_banners_links
(cost=0.00..331.13 rows=7993 width=30) (actual time=0.006..6.209 rows=7923
loops=1)
Index Cond: (status = 4)
-> Sort (cost=970.52..971.58
rows=424 width=8) (actual time=0.885..1.049 rows=366 loops=1)
Sort Key:
public.banners_links.id
-> Bitmap Heap Scan on
banners_links (cost=11.54..952.02 rows=424 width=8) (actual
time=0.121..0.549 rows=424 loops=1)
Recheck Cond: (merchant_id
= 5631)
-> Bitmap Index Scan on
banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual
time=0.087..0.087 rows=424 loops=1)
Index Cond:
(merchant_id = 5631)
-> Hash Join (cost=959.77..1432.13 rows=514 width=43)
(actual time=0.900..22.684 rows=658 loops=359)
Hash Cond: (banners_banner_types.type_id =
banner_types.id)
-> Hash IN Join (cost=957.32..1422.52 rows=540
width=16) (actual time=0.898..21.944 rows=658 loops=359)
Hash Cond: (banners_banner_types.banner_id =
public.banners_links.id)
-> Seq Scan on banners_banner_types
(cost=0.00..376.40 rows=22240 width=16) (actual time=0.004..10.184
rows=22240 loops=359)
-> Hash (cost=952.02..952.02 rows=424
width=8) (actual time=0.751..0.751 rows=424 loops=1)
-> Bitmap Heap Scan on banners_links
(cost=11.54..952.02 rows=424 width=8) (actual time=0.127..0.470 rows=424
loops=1)
Recheck Cond: (merchant_id =
5631)
-> Bitmap Index Scan on
banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual
time=0.086..0.086 rows=424 loops=1)
Index Cond: (merchant_id =
5631)
-> Hash (cost=2.20..2.20 rows=20 width=43)
(actual time=0.037..0.037 rows=20 loops=1)
-> Seq Scan on banner_types
(cost=0.00..2.20 rows=20 width=43) (actual time=0.004..0.015 rows=20
loops=1)
-> Hash IN Join (cost=957.32..1606.26 rows=93 width=16)
(actual time=10.751..10.751 rows=0 loops=553)
Hash Cond: (reward_ratings.banner_id =
public.banners_links.id)
-> Seq Scan on reward_ratings (cost=0.00..633.66
rows=3827 width=16) (actual time=0.007..8.770 rows=4067 loops=553)
Filter: ((now() >= period_start) AND (now() <=
period_end))
-> Hash (cost=952.02..952.02 rows=424 width=8) (actual
time=0.747..0.747 rows=424 loops=1)
-> Bitmap Heap Scan on banners_links
(cost=11.54..952.02 rows=424 width=8) (actual time=0.120..0.472 rows=424
loops=1)
Recheck Cond: (merchant_id = 5631)
-> Bitmap Index Scan on
banners_links_merchant_id_idx (cost=0.00..11.43 rows=424 width=0) (actual
time=0.088..0.088 rows=424 loops=1)
Index Cond: (merchant_id = 5631)
-> Seq Scan on fetch_banners (cost=0.00..2.88 rows=88 width=78)
(actual time=0.003..0.042 rows=88 loops=553)
Total runtime: 14562.251 ms

The same check (merchant_id = 5631) still appears at 5 other places in the
query. If I remove one other, the query goes to 20 seconds, if I then remove
one other again it goes to 28 seconds, etc all the way to more than 40
seconds. I understand the above looks like a complicated mess, but would you
have any pointers of what I could possibly do next to force a better plan?

>Check the archives for mention of equivalence classes, notably these
>two threads:
>http://archives.postgresql.org/pgsql-hackers/2007-01/msg00568.php
>http://archives.postgresql.org/pgsql-hackers/2007-01/msg00826.php

I'm going to read those. Thanks for the references.

_________________________________________________________________
Play online games with your friends with Messenger
http://www.join.msn.com/messenger/overview

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-04-22 20:42:33 Re: Redundant sub query triggers slow nested loop left join
Previous Message Tom Lane 2007-04-22 17:53:26 Re: Redundant sub query triggers slow nested loop left join