Skip site navigation (1) Skip section navigation (2)

Re: How often do I need to reindex tables?

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How often do I need to reindex tables?
Date: 2007-02-28 22:35:20
Message-ID: 20070228173520.e059770b.wmoran@collaborativefusion.com (view raw or flat)
Thread:
Lists: pgsql-general
In response to Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Bill Moran <wmoran(at)collaborativefusion(dot)com> writes:
> > In response to Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >> Can you describe the usage pattern of that index?  I'm curious why it
> >> doesn't maintain reasonably static size.  How often is the underlying
> >> table vacuumed?
> > ...
> > There are 21 jobs, each ranging in size from 2000 - 5000 files.  Each job
> > runs twice a day.  So you're looking at about 60,000 new rows at midnight
> > and 60,000 new rows at noon each day.  With the purge cycle, about the
> > same number of rows are being deleted as are being added, so the table
> > size stays pretty constant.
> > ...
> > Note that the index under discussion is the only one in this database that
> > shows significant bloat.
> 
> Yeah, and there's no obvious reason in what you say why this one should
> bloat either.  Can you say anything about the distribution of the index
> columns --- are you working with a fairly static set of filenameids, or
> does that change over time?  How about the pathids?  How does the
> combination of filenameid x pathid behave?
> 
> A bit of quick arithmetic says that the minimum possible size of that
> index (at 100% fill factor) would be about 20K pages.  What you showed
> us was that it had expanded to 40-some K pages, or a bit under 50% fill
> factor.  This is low but not totally out of line; the traditional rule
> of thumb is that the steady state fill factor will be about 2/3rds for a
> heavily updated btree.  If you leave it go, does it continue to get
> larger, or stay around 40K?

Just an FYI ... I remembered what prompted the cron job.

We were seeing significant performance degradation.  I never did actual
measurements, but it was on the order of "Bill, why is restoring taking
such a long time?" from other systems people.  At the time, I poked around
and tried some stuff here and there and found that reindex restored
performance.  I didn't look at actual size at that time.

Anyway, I'll report back in a few weeks as to what the numbers look like.

-- 
Bill Moran
Collaborative Fusion Inc.

In response to

Responses

pgsql-general by date

Next:From: Dino VlietDate: 2007-02-28 22:40:55
Subject: why can't I increase shared buffers to higher value?
Previous:From: Scott MarloweDate: 2007-02-28 22:23:00
Subject: Re: Select retrieval slowdown after db drop/reload. Suggestions?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group