BUG #6335: Weird planner decision with exists (a join b) condition

From: maxim(dot)boguk(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #6335: Weird planner decision with exists (a join b) condition
Date: 2011-12-14 11:09:38
Message-ID: E1RamiQ-0001jT-V9@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 6335
Logged by: Maksym Boguk
Email address: maxim(dot)boguk(at)gmail(dot)com
PostgreSQL version: 9.0.4
Operating system: Linux Ubuntu
Description:

I was explored reasons of high DB load and I localized the next problem
query:

That is correct version:

EXPLAIN ANALYZE select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);


QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.00..20.51 rows=1 width=65) (actual
time=0.031..0.032 rows=1 loops=1)
-> Index Scan using applicant_adv_subscription_user_id_key on
applicant_adv_subscription aas (cost=0.00..0.02 rows=1 width=65) (actual
time=0.011..0.012 rows=1 loops=1)
Index Cond: (user_id = 5112699)
-> Nested Loop (cost=0.00..20.49 rows=3118 width=4) (actual
time=0.018..0.018 rows=1 loops=1)
-> Index Scan using resume_user_id_key on resume (cost=0.00..0.13
rows=18 width=8) (actual time=0.008..0.008 rows=1 loops=1)
Index Cond: (user_id = 5112699)
-> Index Scan using resume_view_history_fk73b63ccd36b06a5 on
resume_view_history (cost=0.00..0.95 rows=173 width=4) (actual
time=0.009..0.009 rows=1 loops=1)
Index Cond: (resume_view_history.resume_id =
resume.resume_id)
Total runtime: 0.080 ms

But once I add second value into IN list plan become completely screwed:

EXPLAIN analyze select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699,7995496)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=38967.39..735076.65 rows=2 width=65) (actual
time=14656.388..389866.211 rows=1 loops=1)
Join Filter: (aas.user_id = resume.user_id)
-> Bitmap Heap Scan on applicant_adv_subscription aas (cost=0.02..0.04
rows=2 width=65) (actual time=0.018..0.021 rows=2 loops=1)
Recheck Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
-> Bitmap Index Scan on applicant_adv_subscription_user_id_key
(cost=0.00..0.02 rows=2 width=0) (actual time=0.014..0.014 rows=2 loops=1)
Index Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
-> Hash Join (cost=38967.36..726839.23 rows=272203680 width=4) (actual
time=13267.456..182842.841 rows=136136926 loops=2)
Hash Cond: (resume_view_history.resume_id = resume.resume_id)
-> Seq Scan on resume_view_history (cost=0.00..282228.92
rows=272203680 width=4) (actual time=0.004..25574.666 rows=136161776
loops=2)
-> Hash (cost=21737.05..21737.05 rows=16110150 width=8) (actual
time=13260.145..13260.145 rows=16114222 loops=2)
Buckets: 2097152 Batches: 2 Memory Usage: 314776kB
-> Seq Scan on resume (cost=0.00..21737.05 rows=16110150
width=8) (actual time=0.005..8839.480 rows=16114222 loops=2)
Total runtime: 389866.374 ms

Ooops.

Changes in *_cost settings have no effect.
Disabling seq_scan/merge_join/hash_join doest not help as well.

Nothing special about tables, all required indexes on the place.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alvaro Herrera 2011-12-14 14:00:16 Re: BUG #6335: Weird planner decision with exists (a join b) condition
Previous Message Wilfried.Weiss 2011-12-14 06:36:25 BUG #6334: initdb not working