Re: How often do I need to reindex tables?

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How often do I need to reindex tables?
Date: 2007-04-19 13:33:14
Message-ID: 20070419093314.434c2f08.wmoran@collaborativefusion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In response to Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Bill Moran <wmoran(at)collaborativefusion(dot)com> writes:
> > Just an FYI ... I remembered what prompted the cron job.
>
> > We were seeing significant performance degradation. I never did actual
> > measurements, but it was on the order of "Bill, why is restoring taking
> > such a long time?" from other systems people. At the time, I poked around
> > and tried some stuff here and there and found that reindex restored
> > performance. I didn't look at actual size at that time.
>
> A reindex might improve performance for reasons other than bloat --- to
> wit, that a freshly-built index is in perfect physical order, which
> tends to get degraded over time by page splits. How important that is
> depends on your usage patterns. If this is what the story is for your
> situation, then what might fix it (in 8.2) is to create the index with
> FILLFACTOR 50 or so, so that it's already at the steady state density
> and won't need many page splits.
>
> > Anyway, I'll report back in a few weeks as to what the numbers look like.
>
> Yeah, please for the moment just watch what happens with the default
> behavior.

Remember this discussion?

To recap, I had scheduled a weekly reindex of this database because I
was seeing performance issues otherwise. In order to see if this was
actually helping, I disabled the redindex job, ran a few timing
experiments, then scheduled a job to email me the size of the indexes
in the database on a daily basis.

At this point, I have daily records of index size since March 6th.

The behaviour is like this: A freshly created index is about 21,000
pages in size. Under normal usage, the index size balloons to about
38,000 pages immediately after the first backup job is run. From there
it grows slowly (but fairly consistently) by about 100 pages each day.
As of today, it is 44304 pages.

When I first brought up this discussion, the table contained 8068956
rows. It now has 7451381, which means it's dropped by 7%

The important part is that I can't reproduce the performance problems
that I originally thought were the result of this. It's entirely
possible that something else was changed since then that actually
fixed the problem, and that the index bloat was a red herring.

Not sure what (if any) conclusions can be drawn from this. Is there
any other data I should gather? Have I just proved my previous
rantings about the necessity of reindexing to be wrong?

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran(at)collaborativefusion(dot)com
Phone: 412-422-3463x4023

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kev 2007-04-19 13:41:54 Re: Incremental backups
Previous Message Dave Page 2007-04-19 13:30:02 Re: Auditing a database