Why won't the query planner use my index?

From: "Robert Wille" <rwille(at)iarchives(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Why won't the query planner use my index?
Date: 2002-03-28 04:46:49
Message-ID: OE39NZ4mF1IT18olkDT00000355@hotmail.com
Lists: pgsql-general

I am performing a scalability test of PostgreSQL and am having a problem getting the planner to use an index. I have created the following table:

create table a (
    id int default nextval('seq'),
    parent int,
    name varchar(32),
    state bigint default 0,
    scope varchar(80),
    primary key (id)
);

I then populate it with 1M rows, and then create the following index:

create unique index parentindex on a (parent, name);

and then vacuum analyze.

The output from the following explain statements seems unusual:

------------------------
test=# explain select * from a where parent=5;
NOTICE: QUERY PLAN:
Seq Scan on a (cost=0.00..23000.20 rows=46108 width=40)
------------------------

I have tried to no avail to get it to use parentindex on select statements involving only parent.

------------------------
test=# explain select * from a where parent=5 and name between '0' and '1';
NOTICE: QUERY PLAN:
Index Scan using parentindex on a (cost=0.00..4.82 rows=1 width=40)

test=# explain select * from a where parent=5 and name between '0' and '4';
NOTICE: QUERY PLAN:
Seq Scan on a (cost=0.00..28243.08 rows=15323 width=40)
------------------------

The first explain yields what I would expect, but the second does not. How can a sequential scan of the entire table be faster than using the index, or at least using the index to find the rows for the parent and then checking the name on each of those rows?

------------------------
test=# explain select max(id) from a;
NOTICE: QUERY PLAN:
Aggregate (cost=23000.20..23000.20 rows=1 width=4)
  -> Seq Scan on a (cost=0.00..20378.76 rows=1048576 width=4)
------------------------

This one is quite baffling. All the DB needs to do is look at the end of the primary key index.
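
For comparison, the same value can be requested in a form that maps directly onto walking the primary key index from its high end. This is only a sketch: it assumes the planner will satisfy the ORDER BY with a backward scan of the index behind the primary key on id, which should be confirmed with explain rather than taken for granted.

```sql
-- Ask for the largest id by ordering and taking the first row,
-- instead of aggregating over the whole table.
-- Assumption: the planner can read the primary key index in descending order.
select id from a order by id desc limit 1;
```

Unlike max(id), this form returns no row at all on an empty table, so the two are only interchangeable when the table is known to be populated.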

------------------------
test=# explain select * from a where id < 10000;
NOTICE: QUERY PLAN:
Seq Scan on a (cost=0.00..23000.20 rows=9998 width=40)
------------------------

This select statement would select only about 1% of the rows, yet the planner thinks that a sequential scan is faster. If I narrow the range enough that the index is used, the query executes much faster.
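
One way to check whether the row estimates themselves are off is to compare the rows= figures in the plans above with actual counts. A minimal sketch, using only the table and predicates already shown; no output from these statements is included here because I have not captured any.

```sql
-- Actual row count for the predicate the planner estimated at 9998 rows.
select count(*) from a where id < 10000;

-- Actual row count for the predicate the planner estimated at 46108 rows.
select count(*) from a where parent = 5;

-- Total table size, for turning the counts above into fractions.
select count(*) from a;
```

A large gap between these counts and the estimates would point at the statistics rather than at the cost comparison between the two scan types.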

Can someone tell me why the query planner makes seemingly poor choices? Is it possible to force an execution plan (or give hints)? I am using PostgreSQL 7.1.3 for Windows (soon to be for Linux).

You can duplicate my tests by running the attached scripts in this order: create, populate, index.

Attachment          Content-Type              Size
populate-table.sql  application/octet-stream  1.1 KB
index-table.sql     application/octet-stream  69 bytes
create-table.sql    application/octet-stream  175 bytes
