Re: When/if to Reindex

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Moran <wmoran(at)collaborativefusion(dot)com>
Cc: "Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>, "Vivek Khera" <vivek(at)khera(dot)org>, "Pgsql performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: When/if to Reindex
Date: 2007-08-09 01:51:48
Message-ID: 11801.1186624308@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Bill Moran <wmoran(at)collaborativefusion(dot)com> writes:
> In response to "Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>:
>> What's interesting is that an insert-only table can benefit significantly
>> from reindexing after the table is fully loaded.

> I've had similar experience. One thing you didn't mention that I've noticed
> is that VACUUM FULL often bloats indexes. I've made it SOP that
> after application upgrades (which usually includes lots of ALTER TABLES and
> other massive schema and data changes) I VACUUM FULL and REINDEX (in that
> order).

Actually, if that is your intent then the best plan is: drop indexes,
VACUUM FULL, create indexes from scratch. A huge proportion of VACUUM
FULL's time goes into updating the indexes, and that work is basically
wasted if you are going to reindex afterwards.

CLUSTER is a good substitute for V.F. partly because it doesn't try to
update the indexes incrementally, but just does the equivalent of
REINDEX after it's reordered the heap.

I'd make the same remark about Steven's case: if possible, don't create
the indexes at all until you've loaded the table fully.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message justin 2007-08-09 03:34:42 mid 2007 "best bang for the buck" hardware opinions
Previous Message Paul Lambert 2007-08-08 22:13:35 Re: Implementing an regex filter