Skip site navigation (1) Skip section navigation (2)

Indexes not always used after inserts/updates/vacuum analyze

From: "Michael G(dot) Martin" <michael(at)vpmonline(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Indexes not always used after inserts/updates/vacuum analyze
Date: 2002-02-28 02:53:45
Message-ID: 3C7D9BB9.4060006@vpmonline.com (view raw or flat)
Thread:
Lists: pgsql-bugs
I recently upgraded to 7.2 from 7.1.  
Prior to 7,2, I was shutting down the database, droping indexes, 
vacuuming analayze, re-building all the indexes on a nightly basis ( all 
automated of course ;) ).  Things ran fine.
After upgrading to 7.2, I replaced all that with a nightly  on-line 
/usr/local/pgsql/bin/vacuumdb -v -z  on the database.  

The problems I am seeing is this.  As new data is added and modified, 
indexes are not being used with some queries, but work fine on others. 
 Even a vacuum full analyze did not fix the problem.  I had to drop and 
re-build the index for the query to use the index. Now, I see the 
problem starting again.  The table has about 20-30 million rows ( 
4-5Gigs in size), so seq scan kills any access.

This email is lengthy, but I wanted to detail this well.

Here is the table definition:

                Table "symbol_data"

       Column       |         Type          | Modifiers 

--------------------+-----------------------+-----------

 symbol_name        | character varying(10) | not null
 date               | date                  | not null
 open               | numeric(15,3)         | 
 high               | numeric(15,3)         | 
 low                | numeric(15,3)         | 
 d_close            | numeric(15,3)         | 
 volume             | numeric(15,0)         | 
 earnings           | numeric(15,3)         | 
 dividend           | numeric(15,3)         | 
 source             | character varying(50) | 
 daily_mp_12a_long  | character(3)          | 
 weekly_mp_12a_long | character(3)          | 
 daily_mp_32a_long  | character(3)          | 
 weekly_mp_32a_long | character(3)          | 
Indexes: symbol_data_date_indx
Unique keys: symbol_data_pkey
------------------------------------------------------
Index "symbol_data_date_indx"
 Column | Type 
--------+------
 date   | date
btree
------------------------------------------------------
    Index "symbol_data_pkey"
   Column    |         Type          
-------------+-----------------------
 symbol_name | character varying(10)
 date        | date
unique btree
----------------------------------------------------------

Here is what I would expect which usually happens:

explain select * from symbol_data where symbol_name='IBM';
Index Scan using symbol_data_pkey on symbol_data  (cost=0.00..512.99 rows=128 width=129)

Here is one that fails:

explain select * from symbol_data where symbol_name='ELTE';
Seq Scan on symbol_data  (cost=0.00..707415.32 rows=438015 width=129)

Now I thought maybe it had something to do with the concatenated primary key, but:
 explain select * from symbol_data where symbol_name='IBM' and date between '1990-01-01' and '2002-01-01';
 Index Scan using symbol_data_pkey on symbol_data  (cost=0.00..402.55 rows=100 width=129)

explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01';
Seq Scan on symbol_data  (cost=0.00..810075.06 rows=342903 width=129)

Now, changing the date range will eventually use the index:

explain select * from symbol_data where symbol_name='ELTE' and date between '2002-01-01' and '2002-02-01';
NOTICE:  QUERY PLAN:

Index Scan using symbol_data_pkey on symbol_data  (cost=0.00..10815.42 rows=2706 width=129)




...now I do a vacuum analyze:
 VACUUM verbose ANALYZE symbol_data;
NOTICE:  --Relation symbol_data--
NOTICE:  Index symbol_data_date_indx: Pages 49709; Tuples 20536054: Deleted 4221.
	CPU 4.35s/16.30u sec elapsed 45.33 sec.
NOTICE:  Index symbol_data_pkey: Pages 74029; Tuples 20536054: Deleted 4221.
	CPU 6.44s/15.15u sec elapsed 31.00 sec.
NOTICE:  Removed 4221 tuples in 70 pages.
	CPU 0.00s/0.04u sec elapsed 0.08 sec.
NOTICE:  Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 4221, Keep 0, UnUsed 1858963.
	Total CPU 49.20s/36.31u sec elapsed 149.00 sec.
NOTICE:  Analyzing symbol_data
VACUUM

explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01';
Seq Scan on symbol_data  (cost=0.00..810291.94 rows=292916 width=129)

No change on the name.

A new index:

 create index test on symbol_data (symbol_name);

explain select * from symbol_data where symbol_name='ELTE';
NOTICE:  QUERY PLAN:

Seq Scan on symbol_data  (cost=0.00..707611.68 rows=383340 width=129)

EXPLAIN
vpm=> VACUUM verbose ANALYZE  symbol_data;
NOTICE:  --Relation symbol_data--
NOTICE:  Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 0, Keep 0, UnUsed 1863184.
	Total CPU 38.99s/4.50u sec elapsed 67.95 sec.
NOTICE:  Analyzing symbol_data
VACUUM
vpm=>  explain select * from symbol_data where symbol_name='ELTE';
NOTICE:  QUERY PLAN:

Seq Scan on symbol_data  (cost=0.00..707611.68 rows=355958 width=129)

Doesnt works.  I think the only way to make this ever work is to drop the indexes, vacuum full, and rebuild.

Any thoughts?  This is a very dynamic table, but I was hoping the online vacuum in 7.2 would fix the problems.

I'm in the process of splitting this table up into smaller pieces which will make life easier anyway, 
but I think there is something going on here.

Thanks,
Michael










Responses

pgsql-bugs by date

Next:From: Tom LaneDate: 2002-02-28 03:57:10
Subject: Re: Indexes not always used after inserts/updates/vacuum analyze
Previous:From: SidneyDate: 2002-02-27 23:06:31
Subject: Segmentation Fault

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group