This page in other versions: 8.4 / 9.0 / 9.1 / 9.2 / 9.3  |  Development versions: devel / 9.4  |  Unsupported versions: 7.2 / 7.3 / 7.4 / 8.0 / 8.1 / 8.2 / 8.3

Chapter 11. Indexes

Indexes are a common way to enhance database performance. An index allows the database server to find and retrieve specific rows much faster than it could do without an index. But indexes also add overhead to the database system as a whole, so they should be used sensibly.

11.1. Introduction

Suppose we have a table similar to this:

CREATE TABLE test1 (
    id integer,
    content varchar
);

and the application requires a lot of queries of the form

SELECT content FROM test1 WHERE id = constant;

With no advance preparation, the system would have to scan the entire test1 table, row by row, to find all matching entries. If there are a lot of rows in test1 and only a few rows (perhaps only zero or one) that would be returned by such a query, then this is clearly an inefficient method. But if the system has been instructed to maintain an index on the id column, then it can use a more efficient method for locating matching rows. For instance, it might only have to walk a few levels deep into a search tree.

A similar approach is used in most books of non-fiction: terms and concepts that are frequently looked up by readers are collected in an alphabetic index at the end of the book. The interested reader can scan the index relatively quickly and flip to the appropriate page(s), rather than having to read the entire book to find the material of interest. Just as it is the task of the author to anticipate the items that the readers are most likely to look up, it is the task of the database programmer to foresee which indexes would be of advantage.

The following command would be used to create the index on the id column, as discussed:

CREATE INDEX test1_id_index ON test1 (id);

The name test1_id_index can be chosen freely, but you should pick something that enables you to remember later what the index was for.

To remove an index, use the DROP INDEX command. Indexes can be added to and removed from tables at any time.

Once the index is created, no further intervention is required: the system will update the index when the table is modified, and it will use the index in queries when it thinks this would be more efficient than a sequential table scan. But you may have to run the ANALYZE command regularly to update statistics to allow the query planner to make educated decisions. See Chapter 13 for information about how to find out whether an index is used and when and why the planner may choose not to use an index.

Indexes can also benefit UPDATE and DELETE commands with search conditions. Indexes can moreover be used in join queries. Thus, an index defined on a column that is part of a join condition can significantly speed up queries with joins.

When an index is created, the system has to keep it synchronized with the table. This adds overhead to data manipulation operations. Therefore indexes that are non-essential or do not get used at all should be removed. Note that a query or data manipulation command can use at most one index per table.

Comments


Oct. 25, 2004, 2:35 a.m.

I am a programmer for PriceComparison.com.

And I have just solved something quite remarkable and hard to solve. For anyone who has any index using an int8 / bigint field.
You should take time to read this.

The problem start when we noticed that any query with that int8 field is always slow. Finally we did an \'explain\' of the query.
The server never even cared of using the index. It always does a sequential scan. No index scan eventhough the index is clearly there. We tried to reindex, recreate it, vacuum analyze, etc.... nothing works.

Then we accidentaly made it work?!? Strange?
Here is our first query (the one that does not work).

select * from product where item_number=1234

this query will NEVER use the index.

then we tried this:

select * from product where item_number=\'1234\'

Notice the difference is just the single quotes!!! Now it uses the index all the time!

Wow, and this only happens with int8 / bigint.

I hope this will help somebody.

Andrew_@_PriceComparison.com
http://www.PriceComparison.com


Jan. 14, 2005, 3:33 p.m.

Andrew Nurcahya's comment above applies to any SQL plan.

The Postgres server's parser seems to choose a constant's datatype out-of-context. The string [1234] is SQL text is an INTEGER to the planner. A ['1234'] with quotes is text, and gets properly cast to BIGINT to match the left-side of [item_number=1234].

It looks like the server plans BEFORE it promotes constants to the right datatype.

A PREPAREd statment should find the index too. It requires the datatypes of parameters to be declared.


Aug. 28, 2005, 11:09 a.m.

The int8 problem is a known issue. It's not a problem with indexes though. Andrew Droffner is right - the parser seems to treat every integer value without quotes or explicit casts as an int4 and (while performing the query), PostgreSQL seems to cast the TABLE FIELD to int4 - row after row.

However, the 'putting it in quotes' solution MIGHT be (a bit) slower than explicitly casting the number. When I have an int8 field (like an id), I always use the PostgreSQL specific cast operator in the query - like

SELECT bar FROM foo WHERE id=123::int8

This works fine.

Additionally, it's REALLY NOT A GOOD IDEA to mix integer types on fields you might join in a query. I accidentaly did so one time - don't try this at work, kids.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group