Skip site navigation (1) Skip section navigation (2)

Question on Index usage

From: Michael Monnerie <michael(dot)monnerie(at)is(dot)it-management(dot)at>
To: pgsql-admin(at)postgresql(dot)org
Subject: Question on Index usage
Date: 2009-01-21 07:54:27
Message-ID: 200901210854.28091@zmi.at (view raw or flat)
Thread:
Lists: pgsql-admin
http://www.postgresql.org/docs/8.3/interactive/using-explain.html

I tried reading that page, but it's still not clear to me, why the index 
dbmail_messages_1 is better than dbmail_messages_7:

\d dbmail_messages
                                    Tabelle »public.dbmail_messages«
     Spalte     |          Typ          |                           
Attribute
----------------+-----------------------+---------------------------------------------------------------
 message_idnr   | bigint                | not null default 
nextval('dbmail_message_idnr_seq'::regclass)
 mailbox_idnr   | bigint                |
 physmessage_id | bigint                |
<snip other colums not important>
Indexe:
    »dbmail_messages_pkey« PRIMARY KEY, btree (message_idnr)
    »dbmail_messages_1« btree (mailbox_idnr)
    »dbmail_messages_2« btree (physmessage_id)
    »dbmail_messages_3« btree (seen_flag)
    »dbmail_messages_4« btree (unique_id)
    »dbmail_messages_5« btree (status)
    »dbmail_messages_6« btree (status) WHERE status < 2::smallint
    »dbmail_messages_7« btree (mailbox_idnr, status, seen_flag) CLUSTER
    »dbmail_messages_8« btree (mailbox_idnr, status, recent_flag)

Then I do a SELECT that joins this with another table:

EXPLAIN ANALYZE SELECT 1 FROM dbmail_messages msg JOIN 
dbmail_physmessage pm ON ( pm.id = msg.physmessage_id ) WHERE 
message_idnr BETWEEN 3178782 AND 3616157 AND mailbox_idnr = 3236 AND 
status IN (0,1,2) ORDER BY message_idnr ASC;
                                                                          QUERY 
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=8593.80..8605.30 rows=4599 width=8) (actual 
time=707.878..713.738 rows=5228 loops=1)
   Sort Key: msg.message_idnr
   Sort Method:  quicksort  Memory: 438kB
   ->  Hash Join  (cost=6891.33..8314.02 rows=4599 width=8) (actual 
time=667.133..697.308 rows=5228 loops=1)
         Hash Cond: (msg.physmessage_id = pm.id)
         ->  Index Scan using dbmail_messages_1 on dbmail_messages msg  
(cost=0.00..1324.96 rows=4599 width=16) (actual time=0.094..10.958 
rows=5228 loops=1)
               Index Cond: (mailbox_idnr = 3236)
               Filter: ((message_idnr >= 3178782) AND (message_idnr <= 
3616157) AND (status = ANY ('{0,1,2}'::integer[])))
         ->  Hash  (cost=4008.37..4008.37 rows=230637 width=8) (actual 
time=666.628..666.628 rows=229809 loops=1)
               ->  Seq Scan on dbmail_physmessage pm  
(cost=0.00..4008.37 rows=230637 width=8) (actual time=0.012..303.139 
rows=229809 loops=1)
 Total runtime: 729.972 ms

Good, and now I
DROP INDEX dbmail_messages_1;
because anyway there are the _7 and _8 indices which both have 
mailbox_idnr as their first column, so sorting is the same as in the _1 
index. And _7 is even used by CLUSTER. But that makes
                                                                   QUERY 
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=10601.66..10613.20 rows=4615 width=8) (actual 
time=759.415..765.292 rows=5228 loops=1)
   Sort Key: msg.message_idnr
   Sort Method:  quicksort  Memory: 438kB
   ->  Hash Join  (cost=7036.45..10320.79 rows=4615 width=8) (actual 
time=720.352..748.912 rows=5228 loops=1)
         Hash Cond: (msg.physmessage_id = pm.id)
         ->  Bitmap Heap Scan on dbmail_messages msg  
(cost=145.12..3330.99 rows=4650 width=16) (actual time=50.689..60.132 
rows=5228 loops=1)
               Recheck Cond: (mailbox_idnr = 3236)
               Filter: ((message_idnr >= 3178782) AND (message_idnr <= 
3616157) AND (status = ANY ('{0,1,2}'::integer[])))
               ->  Bitmap Index Scan on dbmail_messages_7  
(cost=0.00..143.96 rows=5288 width=0) (actual time=50.628..50.628 
rows=15759 loops=1)
                     Index Cond: (mailbox_idnr = 3236)
         ->  Hash  (cost=4008.37..4008.37 rows=230637 width=8) (actual 
time=669.219..669.219 rows=229809 loops=1)
               ->  Seq Scan on dbmail_physmessage pm  
(cost=0.00..4008.37 rows=230637 width=8) (actual time=0.014..301.793 
rows=229809 loops=1)
 Total runtime: 781.238 ms

So while with the _1 index an "Index Scan" is used, with the _7 it needs 
a "Bitmap Index Scan" plus a "Bitmap Heap Scan". Can somebody explain 
why the _1 index cannot be deleted without loosing performance? The 
plain "Index Scan" could be used with _7 or _8 anyway.

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


Responses

pgsql-admin by date

Next:From: paulo matadrDate: 2009-01-21 13:21:03
Subject: bytea size limit?
Previous:From: Abdul RahmanDate: 2009-01-21 07:46:25
Subject: autovacuum daemon

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group