Re: How often do I need to reindex tables?

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How often do I need to reindex tables?
Date: 2007-02-28 21:24:09
Message-ID: 20070228162409.bb3ee6f9.wmoran@collaborativefusion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In response to Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Bill Moran <wmoran(at)collaborativefusion(dot)com> writes:
> > In response to Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >> Can you describe the usage pattern of that index? I'm curious why it
> >> doesn't maintain reasonably static size. How often is the underlying
> >> table vacuumed?
> > ...
> > There are 21 jobs, each ranging in size from 2000 - 5000 files. Each job
> > runs twice a day. So you're looking at about 60,000 new rows at midnight
> > and 60,000 new rows at noon each day. With the purge cycle, about the
> > same number of rows are being deleted as are being added, so the table
> > size stays pretty constant.
> > ...
> > Note that the index under discussion is the only one in this database that
> > shows significant bloat.
>
> Yeah, and there's no obvious reason in what you say why this one should
> bloat either. Can you say anything about the distribution of the index
> columns --- are you working with a fairly static set of filenameids, or
> does that change over time? How about the pathids? How does the
> combination of filenameid x pathid behave?

My expectation would be that the distribution stays fairly constant and
that filenameids and pathids don't get added in any great number. Most
of the servers that are being backed up are not going to see the file
names or paths change very much, just the contents of those files.

> A bit of quick arithmetic says that the minimum possible size of that
> index (at 100% fill factor) would be about 20K pages. What you showed
> us was that it had expanded to 40-some K pages, or a bit under 50% fill
> factor. This is low but not totally out of line; the traditional rule
> of thumb is that the steady state fill factor will be about 2/3rds for a
> heavily updated btree. If you leave it go, does it continue to get
> larger, or stay around 40K?

I don't remember how big it was getting before I added that cron job. I'll
remove the cron job and replace it with one that emails me the page size
of that index every week. I'll let it go for a few weeks and see how
it manages.

--
Bill Moran
Collaborative Fusion Inc.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Brian A. Seklecki 2007-02-28 21:38:52 Recursive/Wildcard Object Ownership Change
Previous Message Tom Lane 2007-02-28 20:40:58 Re: How often do I need to reindex tables?