Re: How frequently to defrag(cluster)

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: A J <s5aly(at)yahoo(dot)com>
Cc: PG Admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: How frequently to defrag(cluster)
Date: 2011-07-20 21:04:59
Message-ID: 4E2742FB.40201@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On 07/20/2011 12:58 PM, A J wrote:
> I understand that 'cluster' performs the role of defrag (along with
> rewriting in index order) in Postgres.
> How frequently does one have to run cluster ? Any thumb-rules or
> experience ? How do I find if my table is fragmented enough to need a
> cluster ?
> We are still to use Postgres in production, just evaluating at this point.
>
> Thanks for any pointers.
As with everything the answer is "it depends". For a "typical" workload
where the rows updated by a single query are one or a few rowsl, the
automatic vacuum process should handle everything for you without
intervention (some tuning of the vacuum settings may be in order but it
runs by itself).

When you update a row in PostgreSQL, the server inserts the "new"
version of the row but also keeps the "old" row available until it is no
longer visible to any running queries after which that space can be
reclaimed. Locating reclaimable space and making it available for reuse
is the job of the autovacuum process. If everything is tuned properly,
PostgreSQL will keep refilling the "holes" in the table on its own.

There are some exceptions. A query that updates all rows in a table as
might happen in some maintenance operations may increase the table-size
more than you desire. This can be corrected using the cluster command.

Also, there are some workloads that have queries that grab chunks of
data, say all records for a given date, where keeping the associated
data physically close can improve performance.

If you are evaluating PG, you might want to post some info on the nature
of the application. People are here because they like and use PostgreSQL
but most are quite open about saying when it isn't an appropriate
solution as well. You may also get tips on avoiding some common pitfalls
that can prevent PostgreSQL from showing its full potential.

Cheers,
Steve

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Steve Crawford 2011-07-20 21:16:32 Re: How frequently to defrag(cluster)
Previous Message A J 2011-07-20 19:58:18 How frequently to defrag(cluster)