Re: Question on REINDEX

From: "Tambet Matiisen" <t(dot)matiisen(at)aprote(dot)ee>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <josh(at)agliodbs(dot)com>
Cc: "Bill Chandler" <billybobc1210(at)yahoo(dot)com>, "pgsql-perform" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Question on REINDEX
Date: 2005-04-19 08:33:06
Message-ID: A66A11DBF5525341AEF6B8DE39CDE77008806E@black.aprote.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


>
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> >> 1) When is it necessary to run REINDEX or drop/create
> >> an index? All I could really find in the docs is:
>
> > If you need to VACUUM FULL, you need to REINDEX as well.
> For example,
> > if you drop millions of rows from a table.
>
> That's probably a pretty good rule of thumb. It's worth
> noting that VACUUM FULL tends to actively bloat indexes, not
> reduce them in size, because it has to create new index
> entries for the rows it moves before it can delete the old
> ones. So if a VACUUM FULL moves many rows you are likely to
> see the indexes get bigger not smaller.
>

Is my current understanding correct:

1) VACUUM defragments each page locally - moves free space to the end of
page.

2) VACUUM FULL defragments table globally - tries to fill up all
partially free pages and deletes all resulting empty pages.

3) Both VACUUM and VACUUM FULL do only local defragment for indexes.

4) If you want indexes to become fully defragmented, you need to
REINDEX.

If you happen to use triggers for denormalization, like I do, then you
have a lot of updates, which means that tables and indexes become quicky
cluttered with pages, which contain mostly dead tuples. If those tables
and indexes fill up shared buffers, then PostgreSQL slows down, because
it has to do a lot more IO than normal. Regular VACUUM FULL helped, but
I needed REINDEX as well, otherwise indexes grew bigger than tables
itself!

> > Better to up your max_fsm_pages and do regular VACUUMs regularly and
> > frequently so that you don't have to REINDEX at all.
>
> Yes, definitely. Also consider using CLUSTER rather than
> VACUUM FULL when you need to clean up after massive deletions
> from a table. It's not any less intrusive in terms of
> locking, but it's often faster and it avoids the index bloat
> problem (since it effectively does a REINDEX).
>

Hmm, thanks for a tip. BTW, is output of

select count(1), sum(relpages) from pg_class where relkind in
('r','i','t')

good estimate for max_fsm_relations and max_fsm_pages?
Are these parameters used only during VACUUM or in runtime too?

Tambet

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dave Held 2005-04-19 13:34:19 Re: How to improve db performance with $7K?
Previous Message PriceComparison.com 2005-04-19 05:21:06 Re: postgresql faster in Linux than FreeBSD?