Skip site navigation (1) Skip section navigation (2)

Re: When/if to Reindex

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Steven Flatt <steven(dot)flatt(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: When/if to Reindex
Date: 2007-07-18 18:52:51
Message-ID: 20070718185251.GD39272@nasby.net (view raw or flat)
Thread:
Lists: pgsql-performance
On Wed, Jul 18, 2007 at 01:08:30PM -0400, Steven Flatt wrote:
> We're using Postgres 8.2.4.
> 
> I'm trying to decide whether it's worthwhile to implement a process that
> does periodic reindexing.  In a few ad hoc tests, where I've tried to set up
> data similar to how our application does it, I've noticed decent performance
> increases after doing a reindex as well as the planner being more likely to
> choose an index scan.
> 
> Some background: we make extensive use of partitioned tables.  In fact, I'm
> really only considering reindexing partitions that have "just closed".  In
> our simplest/most general case, we have a table partitioned by a timestamp
> column, each partition 24 hours wide.  The partition will have an index on
> the timestamp column as well as a few other indexes including a primary key
> index (all b-tree).  Is there a programmatic way I can decide, upon the
> "closing" of a partition, which, if any, of these indexes will benefit from
> a reindex?  Can I determine things like average node density, node depth, or
> any other indication as to the quality of an index?  Will pg_class.relpages
> be any help here?

Looking at that stuff will help determine if the index is bloated, or if
it's just bigger than optimal. Once you're done writing to an index, it
might be worth reindexing with a fillfactor of 100% to shrink things
down a bit.

> Is it a simple matter of running some queries, reindexing the table, then
> running the queries again to determine overall performance change?  If so,
> what queries would exercise this best?
> 
> Just trying to determine if the extra cost of reindexing newly closed
> partitions will be worth the performance benefit of querying the data.
> Reindexing a table with a day's worth of data is taking on the order of a
> few hours (10s of millions of rows).
> 
> The docs say that:
> 
> "...for B-tree indexes a freshly-constructed index is somewhat faster to
> access than one that has been updated many times, because logically adjacent
> pages are usually also physically adjacent in a newly built index... It
> might be worthwhile to reindex periodically just to improve access speed."

That's the other consideration, though if you're seeing a big difference
I suspect it's an issue of indexes fitting in cache or not.
-- 
Jim Nasby                                      decibel(at)decibel(dot)org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

In response to

pgsql-performance by date

Next:From: Thomas FinneidDate: 2007-07-18 19:08:08
Subject: Re: insert vs select into performance
Previous:From: Jim C. NasbyDate: 2007-07-18 18:50:07
Subject: Re: [PERFORM] Parrallel query execution for UNION ALL Queries

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group