Why do my hash joins turn to nested loops?

From: pgsql-performance(at)ian(dot)org
To: pgsql-performance(at)postgresql(dot)org
Subject: Why do my hash joins turn to nested loops?
Date: 2008-08-21 22:50:35
Message-ID: Pine.LNX.4.64.0808211837250.4498@www.sportsmogul.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I only have a few days of experience with postgres and it is working
great, but when I start building up test queries I ran into a problem I
don't understand.

One query works fast, returning results in under a second. If I insert
one more join into the table however, it switches to nested-loops and
takes minutes. It does this randomly.. using integer compares in the
query lines is fast for instance, but using LIKE operators causes it to
use loops again.

I have about 200,000 items per table, so nested loops cause lots of pain.
The database has objects with lots (100 to 1000) 'specs' for each object
in another table, so I have to pull them out to do queries and sorting on
them.

Here are the two queries. They are the same except the first has
two 'find' joins and the other has three.

I assume that I am hitting some limit somewhere that is causing postgres
to change it's idea on how to run the query. Can I force it to use hash
joins?

Thanks!

So this is fast...

EXPLAIN ANALYZE SELECT * FROM logical
LEFT JOIN model ON logical.uid = model.logical_uid
LEFT JOIN company ON model.company_uid = company.uid
LEFT JOIN type ON logical.type::INT = type.uid
JOIN specs spec_find1 ON spec_find1 .spec_uid='8' AND spec_find1 .text LIKE '%' AND spec_find1 .logical_uid=model.logical_uid
JOIN specs spec_find2 ON spec_find2.spec_uid='8' AND spec_find2.text LIKE '%' AND spec_find2.logical_uid=model.logical_uid
LEFT JOIN specs specs_sort ON specs_sort.spec_uid='4' AND specs_sort.logical_uid=logical.uid
ORDER BY specs_sort.num;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5398.43..5398.44 rows=6 width=247) (actual time=331.546..333.303 rows=3555 loops=1)
Sort Key: specs_sort.num
Sort Method: quicksort Memory: 981kB
-> Hash Left Join (cost=1087.68..5398.35 rows=6 width=247) (actual time=37.309..315.451 rows=3555 loops=1)
Hash Cond: (model.company_uid = company.uid)
-> Hash Left Join (cost=1086.28..5396.86 rows=6 width=217) (actual time=37.235..308.787 rows=3555 loops=1)
Hash Cond: (logical.uid = specs_sort.logical_uid)
-> Hash Left Join (cost=694.84..5004.96 rows=6 width=180) (actual time=22.433..284.832 rows=3555 loops=1)
Hash Cond: ((logical.type)::integer = type.uid)
-> Nested Loop (cost=693.55..5003.62 rows=6 width=168) (actual time=22.361..273.502 rows=3555 loops=1)
-> Hash Join (cost=693.55..4953.84 rows=6 width=110) (actual time=22.330..237.717 rows=3555 loops=1)
Hash Cond: (model.logical_uid = spec_find1.logical_uid)
-> Seq Scan on model (cost=0.00..3337.82 rows=184182 width=36) (actual time=0.017..99.289 rows=184182 loops=1)
-> Hash (cost=691.60..691.60 rows=156 width=74) (actual time=21.795..21.795 rows=2196 loops=1)
-> Hash Join (cost=339.84..691.60 rows=156 width=74) (actual time=8.558..19.060 rows=2196 loops=1)
Hash Cond: (spec_find1.logical_uid = spec_find2.logical_uid)
-> Seq Scan on specs spec_find1 (cost=0.00..326.89 rows=1036 width=37) (actual time=0.023..6.765 rows=2196 loops=1)
Filter: (((text)::text ~~ '%'::text) AND (spec_uid = 8))
-> Hash (cost=326.89..326.89 rows=1036 width=37) (actual time=8.508..8.508 rows=2196 loops=1)
-> Seq Scan on specs spec_find2 (cost=0.00..326.89 rows=1036 width=37) (actual time=0.010..6.667 rows=2196 loops=1)
Filter: (((text)::text ~~ '%'::text) AND (spec_uid = 8))
-> Index Scan using logical_pkey on logical (cost=0.00..8.28 rows=1 width=58) (actual time=0.006..0.007 rows=1 loops=3555)
Index Cond: (logical.uid = model.logical_uid)
-> Hash (cost=1.13..1.13 rows=13 width=12) (actual time=0.024..0.024 rows=13 loops=1)
-> Seq Scan on type (cost=0.00..1.13 rows=13 width=12) (actual time=0.004..0.011 rows=13 loops=1)
-> Hash (cost=287.57..287.57 rows=8309 width=37) (actual time=14.773..14.773 rows=8172 loops=1)
-> Seq Scan on specs specs_sort (cost=0.00..287.57 rows=8309 width=37) (actual time=0.012..8.206 rows=8172 loops=1)
Filter: (spec_uid = 4)
-> Hash (cost=1.18..1.18 rows=18 width=30) (actual time=0.032..0.032 rows=18 loops=1)
-> Seq Scan on company (cost=0.00..1.18 rows=18 width=30) (actual time=0.005..0.013 rows=18 loops=1)
Total runtime: 338.389 ms
(31 rows)

Adding a third 'spec_find' join causes the outer joins to be looped.

EXPLAIN ANALYZE SELECT logical.uid FROM logical
LEFT JOIN model ON logical.uid = model.logical_uid
LEFT JOIN company ON model.company_uid = company.uid
LEFT JOIN type ON logical.type::INT = type.uid
JOIN specs spec_find1 ON spec_find1 .spec_uid='8' AND spec_find1 .text LIKE '%' AND spec_find1 .logical_uid=model.logical_uid
JOIN specs spec_find2 ON spec_find2.spec_uid='8' AND spec_find2.text LIKE '%' AND spec_find2.logical_uid=model.logical_uid
JOIN specs spec_find3 ON spec_find3 .spec_uid='8' AND spec_find3 .text LIKE '%' AND spec_find3 .logical_uid=model.logical_uid
LEFT JOIN specs specs_sort ON specs_sort.spec_uid='4' AND specs_sort.logical_uid=logical.uid
ORDER BY specs_sort.num;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5464.51..5464.52 rows=1 width=8) (actual time=41770.603..41772.111 rows=3555 loops=1)
Sort Key: specs_sort.num
Sort Method: quicksort Memory: 203kB
-> Nested Loop Left Join (cost=1035.47..5464.50 rows=1 width=8) (actual time=40.324..41756.946 rows=3555 loops=1)
-> Nested Loop Left Join (cost=1035.47..5464.22 rows=1 width=12) (actual time=40.313..41716.150 rows=3555 loops=1)
-> Nested Loop Left Join (cost=1035.47..5463.93 rows=1 width=21) (actual time=40.272..41639.961 rows=3555 loops=1)
Join Filter: (specs_sort.logical_uid = logical.uid)
-> Nested Loop (cost=1035.47..5072.50 rows=1 width=17) (actual time=32.400..340.250 rows=3555 loops=1)
-> Hash Join (cost=1035.47..5064.20 rows=1 width=20) (actual time=32.370..251.601 rows=3555 loops=1)
Hash Cond: (model.logical_uid = spec_find1.logical_uid)
-> Seq Scan on model (cost=0.00..3337.82 rows=184182 width=8) (actual time=0.015..95.970 rows=184182 loops=1)
-> Hash (cost=1035.18..1035.18 rows=23 width=12) (actual time=31.890..31.890 rows=2196 loops=1)
-> Hash Join (cost=679.68..1035.18 rows=23 width=12) (actual time=17.178..30.063 rows=2196 loops=1)
Hash Cond: (spec_find1.logical_uid = spec_find2.logical_uid)
-> Hash Join (cost=339.84..691.60 rows=156 width=8) (actual time=8.568..18.308 rows=2196 loops=1)
Hash Cond: (spec_find1.logical_uid = spec_find3.logical_uid)
-> Seq Scan on specs spec_find1 (cost=0.00..326.89 rows=1036 width=4) (actual time=0.015..6.603 rows=2196 loops=1)
Filter: (((text)::text ~~ '%'::text) AND (spec_uid = 8))
-> Hash (cost=326.89..326.89 rows=1036 width=4) (actual time=8.513..8.513 rows=2196 loops=1)
-> Seq Scan on specs spec_find3 (cost=0.00..326.89 rows=1036 width=4) (actual time=0.011..6.851 rows=2196 loops=1)
Filter: (((text)::text ~~ '%'::text) AND (spec_uid = 8))
-> Hash (cost=326.89..326.89 rows=1036 width=4) (actual time=8.588..8.588 rows=2196 loops=1)
-> Seq Scan on specs spec_find2 (cost=0.00..326.89 rows=1036 width=4) (actual time=0.022..6.939 rows=2196 loops=1)
Filter: (((text)::text ~~ '%'::text) AND (spec_uid = 8))
-> Index Scan using logical_pkey on logical (cost=0.00..8.28 rows=1 width=13) (actual time=0.015..0.019 rows=1 loops=3555)
Index Cond: (logical.uid = model.logical_uid)
-> Seq Scan on specs specs_sort (cost=0.00..287.57 rows=8309 width=8) (actual time=0.009..7.996 rows=8172 loops=3555)
Filter: (specs_sort.spec_uid = 4)
-> Index Scan using type_pkey on type (cost=0.00..0.27 rows=1 width=4) (actual time=0.010..0.011 rows=1 loops=3555)
Index Cond: ((logical.type)::integer = type.uid)
-> Index Scan using company_pkey on company (cost=0.00..0.27 rows=1 width=4) (actual time=0.006..0.007 rows=1 loops=3555)
Index Cond: (model.company_uid = company.uid)
Total runtime: 41773.972 ms
(33 rows)

--
Ian Smith
www.ian.org

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2008-08-22 00:49:17 Re: Why do my hash joins turn to nested loops?
Previous Message Joshua Drake 2008-08-21 22:15:44 Re: The state of PG replication in 2008/Q2?