Inoptimal query plan for max() and multicolumn index

From: Vladimir Kulev <me(at)lightoze(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Inoptimal query plan for max() and multicolumn index
Date: 2011-06-20 05:35:34
Message-ID: BANLkTimsNMhnAHk6Wz7gARfdkTmz2QKJ7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all!
Please, just look at these query explanations and try to explain why
planner does so (PostgreSQL 8.4).
There is an index on table sms (number, timestamp).

And three fast & simple queries:
=# explain analyze select max(timestamp) from sms where number='5502712';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.269..0.270
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.259..0.260 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.253..0.253
rows=1 loops=1)
Index Cond: ((number)::text = '5502712'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.342 ms

=# explain analyze select max(timestamp) from sms where number='5802693';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
Index Cond: ((number)::text = '5802693'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.513 ms

=# explain analyze select max(timestamp) from sms where number='5802693';
-------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
-> Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
Index Cond: ((number)::text = '5802693'::text)
Filter: ("timestamp" IS NOT NULL)
Total runtime: 0.513 ms

But this does not work:
# explain analyze select max(timestamp) from sms where number in
('5502712','5802693','5801981');
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=15912.30..15912.31 rows=1 width=8) (actual
time=587.952..587.954 rows=1 loops=1)
-> Bitmap Heap Scan on sms (cost=1413.02..15758.71 rows=61432
width=8) (actual time=34.266..491.853 rows=59078 loops=1)
Recheck Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1397.67 rows=61432 width=0) (actual time=30.778..30.778
rows=59078 loops=1)
Index Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
Total runtime: 588.199 ms

And this too:
# explain analyze select max(timestamp) from sms where
number='5502712' or number='5802693' or number='5801981';
------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16205.75..16205.76 rows=1 width=8) (actual
time=851.204..851.205 rows=1 loops=1)
-> Bitmap Heap Scan on sms (cost=1473.31..16052.17 rows=61432
width=8) (actual time=68.233..745.004 rows=59090 loops=1)
Recheck Cond: (((number)::text = '5502712'::text) OR
((number)::text = '5802693'::text) OR ((number)::text =
'5801981'::text))
-> BitmapOr (cost=1473.31..1473.31 rows=61592 width=0)
(actual time=64.992..64.992 rows=0 loops=1)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.588..0.588 rows=59
loops=1)
Index Cond: ((number)::text = '5502712'::text)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.266..0.266 rows=59
loops=1)
Index Cond: ((number)::text = '5802693'::text)
-> Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1346.69 rows=58434 width=0) (actual time=64.129..64.129
rows=58972 loops=1)
Index Cond: ((number)::text = '5801981'::text)
Total runtime: 853.176 ms

According to planner cost estimations - it has enough data to
understand that it is better to aggregate maximum from three
subqueries. I suppose it's not a bug but not implemented feature -
maybe there is already something about it on roadmap?

--
Vladimir Kulev
Mobile: +7 (921) 555-44-22

Jabber: me(at)lightoze(dot)net

Skype: lightoze

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Julius Tuskenis 2011-06-20 06:36:32 Re: generating a large XML document
Previous Message Jose Ildefonso Camargo Tolosa 2011-06-20 02:19:24 Re: Large rows number, and large objects