Re: when to reindex?

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Kevin Hunter" <hunteke(at)earlham(dot)edu>
Cc: "Jon Roberts" <Jon(dot)Roberts(at)asurion(dot)com>, "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>, "Postgres General List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: when to reindex?
Date: 2008-06-06 20:42:15
Message-ID: 87hcc6cn2w.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Kevin Hunter" <hunteke(at)earlham(dot)edu> writes:

> Or, assuming the REINDEX is for speed/bloat, not for corruption, perhaps
> an option to use the old index as a basis, rather than scanning the
> entire table multiple times as with a CREATE INDEX CONCURRENTLY.

That's been mentioned, it ought to be on the TODO. The trick is determining
*when* to use the index and when to use the table -- but that's something the
planner already does quite well and we could hopefully leverage that.

Note that in typical cases it would be slower. REINDEX scans the table
precisely once and sorts it. The sorting will probably have to do multiple
passes through temporary files which is presumably what you're referring to.
But those passes are still at least sequential. A full index scan has to do
random access reads and in many cases read the same page many times to get
different records.

The cases where it would win would be where you have a lot of dead space in
the table (but not in the index), partial indexes which don't cover much of
the table, or a table which is already very well clustered (possibly,
depending on other factors).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2008-06-06 21:24:51 Re: Application EventLog: could not write to log file: Bad file descriptor
Previous Message Craig Ringer 2008-06-06 20:23:19 Re: when to reindex?