Indexes not always used after inserts/updates/vacuum analyze

From: "Michael G(dot) Martin" <michael(at)vpmonline(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Indexes not always used after inserts/updates/vacuum analyze
Date: 2002-02-28 02:53:45
Message-ID: 3C7D9BB9.4060006@vpmonline.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I recently upgraded to 7.2 from 7.1.
Prior to 7,2, I was shutting down the database, droping indexes,
vacuuming analayze, re-building all the indexes on a nightly basis ( all
automated of course ;) ). Things ran fine.
After upgrading to 7.2, I replaced all that with a nightly on-line
/usr/local/pgsql/bin/vacuumdb -v -z on the database.

The problems I am seeing is this. As new data is added and modified,
indexes are not being used with some queries, but work fine on others.
Even a vacuum full analyze did not fix the problem. I had to drop and
re-build the index for the query to use the index. Now, I see the
problem starting again. The table has about 20-30 million rows (
4-5Gigs in size), so seq scan kills any access.

This email is lengthy, but I wanted to detail this well.

Here is the table definition:

Table "symbol_data"

Column | Type | Modifiers

--------------------+-----------------------+-----------

symbol_name | character varying(10) | not null
date | date | not null
open | numeric(15,3) |
high | numeric(15,3) |
low | numeric(15,3) |
d_close | numeric(15,3) |
volume | numeric(15,0) |
earnings | numeric(15,3) |
dividend | numeric(15,3) |
source | character varying(50) |
daily_mp_12a_long | character(3) |
weekly_mp_12a_long | character(3) |
daily_mp_32a_long | character(3) |
weekly_mp_32a_long | character(3) |
Indexes: symbol_data_date_indx
Unique keys: symbol_data_pkey
------------------------------------------------------
Index "symbol_data_date_indx"
Column | Type
--------+------
date | date
btree
------------------------------------------------------
Index "symbol_data_pkey"
Column | Type
-------------+-----------------------
symbol_name | character varying(10)
date | date
unique btree
----------------------------------------------------------

Here is what I would expect which usually happens:

explain select * from symbol_data where symbol_name='IBM';
Index Scan using symbol_data_pkey on symbol_data (cost=0.00..512.99 rows=128 width=129)

Here is one that fails:

explain select * from symbol_data where symbol_name='ELTE';
Seq Scan on symbol_data (cost=0.00..707415.32 rows=438015 width=129)

Now I thought maybe it had something to do with the concatenated primary key, but:
explain select * from symbol_data where symbol_name='IBM' and date between '1990-01-01' and '2002-01-01';
Index Scan using symbol_data_pkey on symbol_data (cost=0.00..402.55 rows=100 width=129)

explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01';
Seq Scan on symbol_data (cost=0.00..810075.06 rows=342903 width=129)

Now, changing the date range will eventually use the index:

explain select * from symbol_data where symbol_name='ELTE' and date between '2002-01-01' and '2002-02-01';
NOTICE: QUERY PLAN:

Index Scan using symbol_data_pkey on symbol_data (cost=0.00..10815.42 rows=2706 width=129)

...now I do a vacuum analyze:
VACUUM verbose ANALYZE symbol_data;
NOTICE: --Relation symbol_data--
NOTICE: Index symbol_data_date_indx: Pages 49709; Tuples 20536054: Deleted 4221.
CPU 4.35s/16.30u sec elapsed 45.33 sec.
NOTICE: Index symbol_data_pkey: Pages 74029; Tuples 20536054: Deleted 4221.
CPU 6.44s/15.15u sec elapsed 31.00 sec.
NOTICE: Removed 4221 tuples in 70 pages.
CPU 0.00s/0.04u sec elapsed 0.08 sec.
NOTICE: Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 4221, Keep 0, UnUsed 1858963.
Total CPU 49.20s/36.31u sec elapsed 149.00 sec.
NOTICE: Analyzing symbol_data
VACUUM

explain select * from symbol_data where symbol_name='ELTE' and date between '1990-01-01' and '2002-01-01';
Seq Scan on symbol_data (cost=0.00..810291.94 rows=292916 width=129)

No change on the name.

A new index:

create index test on symbol_data (symbol_name);

explain select * from symbol_data where symbol_name='ELTE';
NOTICE: QUERY PLAN:

Seq Scan on symbol_data (cost=0.00..707611.68 rows=383340 width=129)

EXPLAIN
vpm=> VACUUM verbose ANALYZE symbol_data;
NOTICE: --Relation symbol_data--
NOTICE: Pages 450911: Changed 0, Empty 0; Tup 20536054: Vac 0, Keep 0, UnUsed 1863184.
Total CPU 38.99s/4.50u sec elapsed 67.95 sec.
NOTICE: Analyzing symbol_data
VACUUM
vpm=> explain select * from symbol_data where symbol_name='ELTE';
NOTICE: QUERY PLAN:

Seq Scan on symbol_data (cost=0.00..707611.68 rows=355958 width=129)

Doesnt works. I think the only way to make this ever work is to drop the indexes, vacuum full, and rebuild.

Any thoughts? This is a very dynamic table, but I was hoping the online vacuum in 7.2 would fix the problems.

I'm in the process of splitting this table up into smaller pieces which will make life easier anyway,
but I think there is something going on here.

Thanks,
Michael

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2002-02-28 03:57:10 Re: Indexes not always used after inserts/updates/vacuum analyze
Previous Message Sidney 2002-02-27 23:06:31 Segmentation Fault