Question on Index usage

From: Michael Monnerie <michael(dot)monnerie(at)is(dot)it-management(dot)at>
To: pgsql-admin(at)postgresql(dot)org
Subject: Question on Index usage
Date: 2009-01-21 07:54:27
Message-ID: 200901210854.28091@zmi.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

http://www.postgresql.org/docs/8.3/interactive/using-explain.html

I tried reading that page, but it's still not clear to me, why the index
dbmail_messages_1 is better than dbmail_messages_7:

\d dbmail_messages
Tabelle »public.dbmail_messages«
Spalte | Typ |
Attribute
----------------+-----------------------+---------------------------------------------------------------
message_idnr | bigint | not null default
nextval('dbmail_message_idnr_seq'::regclass)
mailbox_idnr | bigint |
physmessage_id | bigint |
<snip other colums not important>
Indexe:
»dbmail_messages_pkey« PRIMARY KEY, btree (message_idnr)
»dbmail_messages_1« btree (mailbox_idnr)
»dbmail_messages_2« btree (physmessage_id)
»dbmail_messages_3« btree (seen_flag)
»dbmail_messages_4« btree (unique_id)
»dbmail_messages_5« btree (status)
»dbmail_messages_6« btree (status) WHERE status < 2::smallint
»dbmail_messages_7« btree (mailbox_idnr, status, seen_flag) CLUSTER
»dbmail_messages_8« btree (mailbox_idnr, status, recent_flag)

Then I do a SELECT that joins this with another table:

EXPLAIN ANALYZE SELECT 1 FROM dbmail_messages msg JOIN
dbmail_physmessage pm ON ( pm.id = msg.physmessage_id ) WHERE
message_idnr BETWEEN 3178782 AND 3616157 AND mailbox_idnr = 3236 AND
status IN (0,1,2) ORDER BY message_idnr ASC;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=8593.80..8605.30 rows=4599 width=8) (actual
time=707.878..713.738 rows=5228 loops=1)
Sort Key: msg.message_idnr
Sort Method: quicksort Memory: 438kB
-> Hash Join (cost=6891.33..8314.02 rows=4599 width=8) (actual
time=667.133..697.308 rows=5228 loops=1)
Hash Cond: (msg.physmessage_id = pm.id)
-> Index Scan using dbmail_messages_1 on dbmail_messages msg
(cost=0.00..1324.96 rows=4599 width=16) (actual time=0.094..10.958
rows=5228 loops=1)
Index Cond: (mailbox_idnr = 3236)
Filter: ((message_idnr >= 3178782) AND (message_idnr <=
3616157) AND (status = ANY ('{0,1,2}'::integer[])))
-> Hash (cost=4008.37..4008.37 rows=230637 width=8) (actual
time=666.628..666.628 rows=229809 loops=1)
-> Seq Scan on dbmail_physmessage pm
(cost=0.00..4008.37 rows=230637 width=8) (actual time=0.012..303.139
rows=229809 loops=1)
Total runtime: 729.972 ms

Good, and now I
DROP INDEX dbmail_messages_1;
because anyway there are the _7 and _8 indices which both have
mailbox_idnr as their first column, so sorting is the same as in the _1
index. And _7 is even used by CLUSTER. But that makes
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=10601.66..10613.20 rows=4615 width=8) (actual
time=759.415..765.292 rows=5228 loops=1)
Sort Key: msg.message_idnr
Sort Method: quicksort Memory: 438kB
-> Hash Join (cost=7036.45..10320.79 rows=4615 width=8) (actual
time=720.352..748.912 rows=5228 loops=1)
Hash Cond: (msg.physmessage_id = pm.id)
-> Bitmap Heap Scan on dbmail_messages msg
(cost=145.12..3330.99 rows=4650 width=16) (actual time=50.689..60.132
rows=5228 loops=1)
Recheck Cond: (mailbox_idnr = 3236)
Filter: ((message_idnr >= 3178782) AND (message_idnr <=
3616157) AND (status = ANY ('{0,1,2}'::integer[])))
-> Bitmap Index Scan on dbmail_messages_7
(cost=0.00..143.96 rows=5288 width=0) (actual time=50.628..50.628
rows=15759 loops=1)
Index Cond: (mailbox_idnr = 3236)
-> Hash (cost=4008.37..4008.37 rows=230637 width=8) (actual
time=669.219..669.219 rows=229809 loops=1)
-> Seq Scan on dbmail_physmessage pm
(cost=0.00..4008.37 rows=230637 width=8) (actual time=0.014..301.793
rows=229809 loops=1)
Total runtime: 781.238 ms

So while with the _1 index an "Index Scan" is used, with the _7 it needs
a "Bitmap Index Scan" plus a "Bitmap Heap Scan". Can somebody explain
why the _1 index cannot be deleted without loosing performance? The
plain "Index Scan" could be used with _7 or _8 anyway.

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message paulo matadr 2009-01-21 13:21:03 bytea size limit?
Previous Message Abdul Rahman 2009-01-21 07:46:25 autovacuum daemon