hashed subplan 5000x slower than two sequential operations

From: Bryce Nesbitt <bryce2(at)obviously(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: hashed subplan 5000x slower than two sequential operations
Date: 2010-12-08 18:53:58
Message-ID: 4CFFD446.6040609@obviously.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Can you help me understand how to optimize the following. There's a
subplan which in this case returns 3 rows,
but it is really expensive:

=========================================================================
explain analyze SELECT contexts.context_key
FROM contexts
JOIN articles
ON (articles.context_key=contexts.context_key)
WHERE (contexts.parent_key = 392210
OR contexts.context_key IN
(SELECT collection_data.context_key
FROM collection_data
WHERE collection_data.collection_context_key = 392210)
)
AND articles.indexed
;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=83054.41..443755.45 rows=261077 width=4) (actual
time=4362.143..6002.808 rows=28 loops=1)
Hash Cond: (articles.context_key = contexts.context_key)
-> Seq Scan on articles (cost=0.00..345661.91 rows=522136 width=4)
(actual time=0.558..3953.002 rows=517356 loops=1)
Filter: indexed
-> Hash (cost=69921.25..69921.25 rows=800493 width=4) (actual
time=829.501..829.501 rows=31 loops=1)
-> Seq Scan on contexts (cost=14.31..69921.25 rows=800493
width=4) (actual time=1.641..829.339 rows=31 loops=1)
Filter: ((parent_key = 392210) OR (hashed subplan))
SubPlan
-> Index Scan using collection_data_context_key_index
on collection_data (cost=0.00..14.30 rows=6 width=4) (actual
time=0.018..0.023 rows=3 loops=1)
Index Cond: (collection_context_key = 392210)
Total runtime: 6002.976 ms
(11 rows)

=========================================================================
explain analyze SELECT contexts.context_key
FROM contexts
JOIN articles
ON (articles.context_key=contexts.context_key)
WHERE (contexts.parent_key = 392210 OR contexts.parent_key IN
(392210,392210,395073,1304250))
AND articles.indexed
;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=14.35..1863.85 rows=94 width=4) (actual
time=0.098..1.038 rows=57 loops=1)
-> Bitmap Heap Scan on contexts (cost=14.35..572.57 rows=288
width=4) (actual time=0.079..0.274 rows=59 loops=1)
Recheck Cond: ((parent_key = 392210) OR (parent_key = ANY
('{392210,392210,395073,1304250}'::integer[])))
-> BitmapOr (cost=14.35..14.35 rows=288 width=0) (actual
time=0.066..0.066 rows=0 loops=1)
-> Bitmap Index Scan on parent_key_idx
(cost=0.00..3.07 rows=58 width=0) (actual time=0.028..0.028 rows=28 loops=1)
Index Cond: (parent_key = 392210)
-> Bitmap Index Scan on parent_key_idx
(cost=0.00..11.13 rows=231 width=0) (actual time=0.035..0.035 rows=87
loops=1)
Index Cond: (parent_key = ANY
('{392210,392210,395073,1304250}'::integer[]))
-> Index Scan using article_key_idx on articles (cost=0.00..4.47
rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=59)
Index Cond: (articles.context_key = contexts.context_key)
Filter: articles.indexed
Total runtime: 1.166 ms
(12 rows)

production=> explain analyze SELECT contexts.context_key
FROM contexts
JOIN articles
ON (articles.context_key=contexts.context_key)
WHERE (contexts.parent_key = 392210
OR contexts.context_key IN
(SELECT collection_data.context_key
FROM collection_data
WHERE collection_data.collection_context_key = 392210)
)
AND articles.indexed
;

=========================================================================
# select version();
PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2010-12-08 19:00:52 Re: Group commit and commit delay/siblings
Previous Message salima 2010-12-08 16:53:21 Tunning Postgres