Skip site navigation (1) Skip section navigation (2)

Surprising no use of indexes - low performance

From: Nicolas Charles <nicolas(dot)charles(at)normation(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Surprising no use of indexes - low performance
Date: 2013-02-14 15:35:12
Message-ID: 511D0430.1090603@normation.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hello,

I've been struggling to understand what's happening on my 
databases/query for several days, and I'm turning to higher mind for a 
logical answer.

I'm dealing with a fairly large database, containing logs informations, 
that I crunch to get data out of it, with several indexes on them that I 
hoped were logical

\d ruddersysevents
                                  Table « public.ruddersysevents »
       Colonne       |           Type           |                  
Modificateurs
--------------------+--------------------------+--------------------------------------------------
  id                 | integer                  | non NULL Par défaut, 
nextval('serial'::regclass)
  executiondate      | timestamp with time zone | non NULL
  nodeid             | text                     | non NULL
  directiveid        | text                     | non NULL
  ruleid             | text                     | non NULL
  serial             | integer                  | non NULL
  component          | text                     | non NULL
  keyvalue           | text                     |
  executiontimestamp | timestamp with time zone | non NULL
  eventtype          | character varying(64)    |
  policy             | text                     |
  msg                | text                     |
Index :
     "ruddersysevents_pkey" PRIMARY KEY, btree (id)
     "component_idx" btree (component)
     "configurationruleid_idx" btree (ruleid)
     "executiontimestamp_idx" btree (executiontimestamp)
     "keyvalue_idx" btree (keyvalue)
     "nodeid_idx" btree (nodeid)
Contraintes de vérification :
     "ruddersysevents_component_check" CHECK (component <> ''::text)
     "ruddersysevents_configurationruleid_check" CHECK (ruleid <> ''::text)
     "ruddersysevents_nodeid_check" CHECK (nodeid <> ''::text)
     "ruddersysevents_policyinstanceid_check" CHECK (directiveid <> 
''::text)


It contains 11018592 entries, with the followinf patterns :
108492 distinct executiontimestamp
14 distinct nodeid
59 distinct directiveid
26 distinct ruleid
35 distinct serial

Related table/index size are
                 relation                |  size
----------------------------------------+---------
  public.ruddersysevents                 | 3190 MB
  public.nodeid_idx                      | 614 MB
  public.configurationruleid_idx         | 592 MB
  public.ruddersysevents_pkey            | 236 MB
  public.executiontimestamp_idx          | 236 MB


I'm crunching the data by looking for each 
nodeid/ruleid/directiveid/serial with an executiontimestamp in an interval:

explain analyze select executiondate, nodeid, ruleid, directiveid, 
serial, component, keyValue, executionTimeStamp, eventtype, policy, msg 
from RudderSysEvents where 1=1  and nodeId = 
'31264061-5ecb-4891-9aa4-83824178f43d'  and ruleId = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd' and serial = 10 and 
executiontimestamp between to_timestamp('2012-11-22 16:00:16.005', 
'YYYY-MM-DD HH24:MI:SS.MS') and to_timestamp('2013-01-25 18:53:52.467', 
'YYYY-MM-DD HH24:MI:SS.MS') ORDER BY executionTimeStamp asc;
  Sort  (cost=293125.41..293135.03 rows=3848 width=252) (actual 
time=28628.922..28647.952 rows=62403 loops=1)
    Sort Key: executiontimestamp
    Sort Method:  external merge  Disk: 17480kB
    ->  Bitmap Heap Scan on ruddersysevents  (cost=74359.66..292896.27 
rows=3848 width=252) (actual time=1243.150..28338.927 rows=62403 loops=1)
          Recheck Cond: ((nodeid = 
'31264061-5ecb-4891-9aa4-83824178f43d'::text) AND (ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text))
          Filter: ((serial = 10) AND (executiontimestamp >= 
to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD 
HH24:MI:SS.MS'::text)) AND (executiontimestamp <= 
to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD 
HH24:MI:SS.MS'::text)))
          ->  BitmapAnd  (cost=74359.66..74359.66 rows=90079 width=0) 
(actual time=1228.610..1228.610 rows=0 loops=1)
                ->  Bitmap Index Scan on nodeid_idx  
(cost=0.00..25795.17 rows=716237 width=0) (actual time=421.365..421.365 
rows=690503 loops=1)
                      Index Cond: (nodeid = 
'31264061-5ecb-4891-9aa4-83824178f43d'::text)
                ->  Bitmap Index Scan on configurationruleid_idx  
(cost=0.00..48562.32 rows=1386538 width=0) (actual time=794.490..794.490 
rows=1381391 loops=1)
                      Index Cond: (ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)
  Total runtime: 28657.352 ms



I'm surprised that the executiontimestamp index is not used, since it 
seems to be where most of the query time is spent.

For all my tests, I removed all the incoming logs, so that this table 
has only selects and no writes

I'm using Postgres 8.4, on a quite smallish VM, with some process 
runnings, with the following non default configuration
shared_buffers = 112MB
work_mem = 8MB
maintenance_work_mem = 48MB
max_stack_depth = 3MB
wal_buffers = 1MB
effective_cache_size = 128MB
checkpoint_segments = 6

Increasing the shared_buffers to 384, 1GB or 1500MB didn't improve the 
performances (less than 10%). I would have expected it to improve, since 
the indexes would all fit in RAM
Every explain analyze made after modification of configuration and 
indexes where done after a complete batch of crunch of logs by our apps, 
to be sure the stats were corrects

I tested further with the indexes, and reached these results :
1/ dropping the unique index  "configurationruleid_idx" btree (ruleid), 
"executiontimestamp_idx" btree (executiontimestamp),  "keyvalue_idx" 
btree (keyvalue),  "nodeid_idx" btree (nodeid) to replace them by a 
unique index did lower the perfs :

create index composite_idx on ruddersysevents (executiontimestamp, 
ruleid, serial, nodeid);

Index Scan using composite_idx on ruddersysevents  (cost=0.01..497350.22 
rows=3729 width=252) (actual time=7.989..83717.901 rows=62403 loops=1)
    Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 
16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND 
(executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 
'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND 
(nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text))


2/ Removing nodeid from the index did lower again the perf

create index composite2_idx on ruddersysevents (executiontimestamp, 
ruleid, serial);

  Index Scan using composite2_idx on ruddersysevents  
(cost=0.01..449948.98 rows=3729 width=252) (actual 
time=23.507..84888.349 rows=62403 loops=1)
    Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 
16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND 
(executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 
'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10))
    Filter: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)

3/ Removing executiontimestamp from the composite index makes the query 
performs better at the begining of its uses (around 17 secondes), but 
over time it degrades (I'm logging query longer than 20 secondes, and 
there are very rare in the first half of the batch, and getting more and 
more common at the end) to what is below

create index composite3_idx on ruddersysevents (ruleid, serial, nodeid);
   Sort  (cost=24683.44..24693.07 rows=3849 width=252) (actual 
time=60454.558..60474.013 rows=62403 loops=1)
    Sort Key: executiontimestamp
    Sort Method:  external merge  Disk: 17480kB
    ->  Bitmap Heap Scan on ruddersysevents  (cost=450.12..24454.23 
rows=3849 width=252) (actual time=146.065..60249.143 rows=62403 loops=1)
          Recheck Cond: ((ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND 
(nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text))
          Filter: ((executiontimestamp >= to_timestamp('2012-11-22 
16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND 
(executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 
'YYYY-MM-DD HH24:MI:SS.MS'::text)))
          ->  Bitmap Index Scan on composite3_idx  (cost=0.00..449.15 
rows=6635 width=0) (actual time=129.102..129.102 rows=62403 loops=1)
                Index Cond: ((ruleid = 
'61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND 
(nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text))
  Total runtime: 60484.022 ms


And with all these tests, the difference of the whole database 
processing with each index time is within 10% error margin (2h10 to 
2h25), even if most of the time is spent on the SQL query.

So my question is :
"Why *not* indexing the column which is not used makes the query slower 
over time, while not slowing the application?"

If you have some clues, I'm really interested.

Best regards
Nicolas



Responses

pgsql-performance by date

Next:From: Dan KoganDate: 2013-02-14 17:27:13
Subject: Re: High CPU usage / load average after upgrading to Ubuntu 12.04
Previous:From: Merlin MoncureDate: 2013-02-14 14:07:48
Subject: Re: High CPU usage / load average after upgrading to Ubuntu 12.04

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group