Re: Secondary indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Leandro Fanzone <leandro(at)hasar(dot)com>
Cc: postgreSQL <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Secondary indexes
Date: 2001-02-08 19:22:04
Message-ID: 12555.981660124@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Leandro Fanzone <leandro(at)hasar(dot)com> writes:
> Hello. I have a table with, say, three fields. The first is the ID
> (integer, unique, primary index). The second is an optional index that
> sometimes can be blank, or even duplicated, a varchar(13). The third
> one is the data I want to retrieve, it has no importance in this
> problem. I need to access sometimes by the ID and sometimes by the
> secondary key, so I made an index using

> CREATE INDEX my_index_name ON MY_TABLE(my_secondary_id);

> When I select using the primary key, obviously uses the index created
> by default. When I select using the secondary key, it EXPLAINs me
> that it would use sequencial scan instead of the index I created, thus
> this search becomes extremely slow. Why the engine would ignore the
> index?

Probably because it thinks the indexscan would not be very selective.
An indexscan that has to visit more than a few percent of the rows in
a table is actually slower than a seqscan, typically, and so the planner
won't choose an indexscan if it thinks a large number of rows will be
scanned.

If the secondary column has a lot of "dummy" values as you imply, it's
important to be sure that the dummy values are NULLs, not any other
randomly chosen value; otherwise the dummies will skew the VACUUM
ANALYZE statistics so that the planner will think the column contains
only a few oft-repeated values. If it thinks that, then it's likely
to avoid indexscans.

If you need more help, please send along the exact output of EXPLAIN
for your problem query, also the EXPLAIN result after doing "SET
enable_seqscan TO OFF", and the results of

select attname,attdisbursion,s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 'YOURTABLENAMEHERE';

so we can see what statistics the planner is looking at and what its cost
estimates are. (NOTE: these directions assume you are running 7.0.*)

regards, tom lane

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Leandro Fanzone 2001-02-08 20:14:45 Re: Secondary indexes
Previous Message Leandro Fanzone 2001-02-08 18:52:14 Secondary indexes