REINDEX rebuilds an index based on the data stored in the index's table, replacing the old copy of the index. There are two main reasons to use REINDEX:
An index has become corrupted, and no longer contains valid data. Although in theory this should never happen, in practice indexes may become corrupted due to software bugs or hardware failures. REINDEX provides a recovery method.
The index in question contains a lot of dead index pages that are not being reclaimed. This can occur with B-tree indexes in PostgreSQL under certain access patterns. REINDEX provides a way to reduce the space consumption of the index by writing a new version of the index without the dead pages. See Section 21.2 for more information.
Recreate all system indexes of a specified database. Indexes on user tables are not processed. Also, indexes on shared system catalogs are skipped except in stand-alone mode (see below).
Recreate all indexes of a specified table. If the table has a secondary "TOAST" table, that is reindexed as well.
Recreate a specified index.
The name of the specific database, table, or index to be reindexed. Table and index names may be schema-qualified. Presently, REINDEX DATABASE can only reindex the current database, so its parameter must match the current database's name.
This is an obsolete option; it is ignored if specified.
If you suspect corruption of an index on a user table, you can simply rebuild that index, or all indexes on the table, using REINDEX INDEX or REINDEX TABLE.
Things are more difficult if you need to recover from corruption of an index on a system table. In this case it's important for the system to not have used any of the suspect indexes itself. (Indeed, in this sort of scenario you may find that server processes are crashing immediately at start-up, due to reliance on the corrupted indexes.) To recover safely, the server must be started with the -P option, which prevents it from using indexes for system catalog lookups.
One way to do this is to shut down the postmaster and start a stand-alone PostgreSQL server with the -P option included on its command line. Then, REINDEX DATABASE, REINDEX TABLE, or REINDEX INDEX can be issued, depending on how much you want to reconstruct. If in doubt, use REINDEX DATABASE to select reconstruction of all system indexes in the database. Then quit the standalone server session and restart the regular server. See the postgres reference page for more information about how to interact with the stand-alone server interface.
Alternatively, a regular server session can be started with -P included in its command line options. The method for doing this varies across clients, but in all libpq-based clients, it is possible to set the PGOPTIONS environment variable to -P before starting the client. Note that while this method does not require locking out other clients, it may still be wise to prevent other users from connecting to the damaged database until repairs have been completed.
If corruption is suspected in the indexes of any of the shared system catalogs (pg_database, pg_group, pg_shadow, or pg_tablespace), then a standalone server must be used to repair it. REINDEX will not process shared catalogs in multiuser mode.
For all indexes except the shared system catalogs, REINDEX is crash-safe and transaction-safe. REINDEX is not crash-safe for shared indexes, which is why this case is disallowed during normal operation. If a failure occurs while reindexing one of these catalogs in standalone mode, it will not be possible to restart the regular server until the problem is rectified. (The typical symptom of a partially rebuilt shared index is "index is not a btree" errors.)
REINDEX is similar to a drop and recreate of the index in that the index contents are rebuilt from scratch. However, the locking considerations are rather different. REINDEX locks out writes but not reads of the index's parent table. It also takes an exclusive lock on the specific index being processed, which will block reads that attempt to use that index. In contrast, DROP INDEX momentarily takes exclusive lock on the parent table, blocking both writes and reads. The subsequent CREATE INDEX locks out writes but not reads; since the index is not there, no read will attempt to use it, meaning that there will be no blocking but reads may be forced into expensive sequential scans. Another important point is that the drop/create approach invalidates any cached query plans that use the index, while REINDEX does not.
Prior to PostgreSQL 7.4, REINDEX TABLE did not automatically process TOAST tables, and so those had to be reindexed by separate commands. This is still possible, but redundant.
Recreate the indexes on the table my_table:
REINDEX TABLE my_table;
Rebuild a single index:
REINDEX INDEX my_index;
Rebuild all system indexes in a particular database, without trusting them to be valid already:
$ export PGOPTIONS="-P" $ psql broken_db ... broken_db=> REINDEX DATABASE broken_db; broken_db=> \q