Two fast queries get slow when combined

From: cluster <skrald(at)amossen(dot)dk>
To: pgsql-performance(at)postgresql(dot)org
Subject: Two fast queries get slow when combined
Date: 2007-10-30 20:48:16
Message-ID: fg85ae$1740$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have two small queries which are both very fast to evaluate
separately. The first query, "Query 1", calculates some statistics and
the the second query, "Query 2", finds a subset of relevant keys.
When combined into a single query which calculates statistics from only
the subset of relevant keys the evaluation plan explodes and uses both
seq scans and bitmap heap scans.
How can I improve the performance of the combined query?

Queries and output from EXPLAIN ANALYZE can be seen here with some
syntax highlighting:
http://rafb.net/p/BJIW4p69.html

I will also paste it here:

=============================================================================
QUERY 1 (very *fast*):
=============================================================================
SELECT keyId, count(1) as num_matches
FROM stats
GROUP BY keyId
LIMIT 50

Limit (cost=0.00..23.65 rows=50 width=8) (actual time=0.090..2.312
rows=50 loops=1)
-> GroupAggregate (cost=0.00..4687.46 rows=9912 width=8) (actual
time=0.085..2.145 rows=50 loops=1)
-> Index Scan using stats_keyId on stats (cost=0.00..3820.19
rows=99116 width=8) (actual time=0.031..1.016 rows=481 loops=1)
Total runtime: 2.451 ms
(4 rows)

=============================================================================
QUERY 2 (very *fast*):
=============================================================================
SELECT keyId, sortNum
FROM items i
WHERE sortNum > 123
ORDER BY sortNum
LIMIT 50

Limit (cost=0.01..9.87 rows=50 width=8) (actual time=0.068..0.610
rows=50 loops=1)
InitPlan
-> Limit (cost=0.00..0.01 rows=1 width=0) (actual
time=0.009..0.025 rows=1 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual
time=0.006..0.007 rows=1 loops=1)
-> Index Scan using items_sortNum on items i (cost=0.00..1053.67
rows=5344 width=8) (actual time=0.063..0.455 rows=50 loops=1)
Index Cond: (sortNum >= $0)
Total runtime: 0.749 ms
(7 rows)

=============================================================================
COMBINED QUERY (very *slow*):
=============================================================================
SELECT keyId, sortNum, count(1)
FROM stats s, items i
WHERE s.keyId = i.keyId AND i.sortNum > 123
GROUP BY i.keyId, i.sortNum
ORDER BY i.sortNum
LIMIT 50

Limit (cost=3281.72..3281.84 rows=50 width=16) (actual
time=435.838..436.043 rows=50 loops=1)
InitPlan
-> Limit (cost=0.00..0.01 rows=1 width=0) (actual
time=0.016..0.021 rows=1 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual
time=0.012..0.013 rows=1 loops=1)
-> Sort (cost=3281.71..3289.97 rows=3304 width=16) (actual
time=435.833..435.897 rows=50 loops=1)
Sort Key: i.sortNum
-> Hash Join (cost=2745.80..3088.59 rows=3304 width=16)
(actual time=364.247..413.164 rows=8490 loops=1)
Hash Cond: (s.keyId = i.keyId)
-> HashAggregate (cost=2270.53..2394.43 rows=9912
width=8) (actual time=337.869..356.533 rows=9911 loops=1)
-> Seq Scan on items (cost=0.00..1527.16
rows=99116 width=8) (actual time=0.016..148.118 rows=99116 loops=1)
-> Hash (cost=408.47..408.47 rows=5344 width=12)
(actual time=26.342..26.342 rows=4491 loops=1)
-> Bitmap Heap Scan on items i
(cost=121.67..408.47 rows=5344 width=12) (actual time=5.007..16.898
rows=4491 loops=1)
Recheck Cond: (sortNum >= $0)
-> Bitmap Index Scan on items_sortNum
(cost=0.00..120.33 rows=5344 width=0) (actual time=4.273..4.273
rows=13375 loops=1)
Index Cond: (sortNum >= $0)
Total runtime: 436.421 ms
(16 rows)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steven Flatt 2007-10-30 21:00:10 Re: tables with 300+ partitions
Previous Message Guillaume Lelarge 2007-10-30 19:21:05 Re: Optimizing PostgreSQL for Windows