From: | Vladimir Kulev <me(at)lightoze(dot)net> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Inoptimal query plan for max() and multicolumn index |
Date: | 2011-06-20 05:35:34 |
Message-ID: | BANLkTimsNMhnAHk6Wz7gARfdkTmz2QKJ7g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi all!
Please, just look at these query explanations and try to explain why
planner does so (PostgreSQL 8.4).
There is an index on table sms (number, timestamp).
And three fast & simple queries:
=# explain analyze select max(timestamp) from sms where number='5502712';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.269..0.270
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.259..0.260 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.253..0.253
rows=1 loops=1)
Index Cond: ((number)::text = '5502712'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.342 ms
=# explain analyze select max(timestamp) from sms where number='5802693';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
Index Cond: ((number)::text = '5802693'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.513 ms
=# explain analyze select max(timestamp) from sms where number='5802693';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
Index Cond: ((number)::text = '5802693'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.513 ms
But this does not work:
# explain analyze select max(timestamp) from sms where number in
('5502712','5802693','5801981');
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=15912.30..15912.31 rows=1 width=8) (actual
time=587.952..587.954 rows=1 loops=1)
-> Bitmap Heap Scan on sms (cost=1413.02..15758.71 rows=61432
width=8) (actual time=34.266..491.853 rows=59078 loops=1)
Recheck Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1397.67 rows=61432 width=0) (actual time=30.778..30.778
rows=59078 loops=1)
Index Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
Total runtime: 588.199 ms
And this too:
# explain analyze select max(timestamp) from sms where
number='5502712' or number='5802693' or number='5801981';
------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16205.75..16205.76 rows=1 width=8) (actual
time=851.204..851.205 rows=1 loops=1)
-> Bitmap Heap Scan on sms (cost=1473.31..16052.17 rows=61432
width=8) (actual time=68.233..745.004 rows=59090 loops=1)
Recheck Cond: (((number)::text = '5502712'::text) OR
((number)::text = '5802693'::text) OR ((number)::text =
'5801981'::text))
-> BitmapOr (cost=1473.31..1473.31 rows=61592 width=0)
(actual time=64.992..64.992 rows=0 loops=1)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.588..0.588 rows=59
loops=1)
Index Cond: ((number)::text = '5502712'::text)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.266..0.266 rows=59
loops=1)
Index Cond: ((number)::text = '5802693'::text)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1346.69 rows=58434 width=0) (actual time=64.129..64.129
rows=58972 loops=1)
Index Cond: ((number)::text = '5801981'::text)
Total runtime: 853.176 ms
According to planner cost estimations - it has enough data to
understand that it is better to aggregate maximum from three
subqueries. I suppose it's not a bug but not implemented feature -
maybe there is already something about it on roadmap?
--
Vladimir Kulev
Mobile: +7 (921) 555-44-22
Jabber: me(at)lightoze(dot)net
Skype: lightoze
From | Date | Subject | |
---|---|---|---|
Next Message | Julius Tuskenis | 2011-06-20 06:36:32 | Re: generating a large XML document |
Previous Message | Jose Ildefonso Camargo Tolosa | 2011-06-20 02:19:24 | Re: Large rows number, and large objects |