Re: questions about CLUSTER

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: Douglas J Hunley <doug(at)hunley(dot)homeip(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: questions about CLUSTER
Date: 2008-02-27 17:40:57
Message-ID: 20080227124057.9b3937ce.wmoran@collaborativefusion.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In response to Douglas J Hunley <doug(at)hunley(dot)homeip(dot)net>:

> After reviewing http://www.postgresql.org/docs/8.3/static/sql-cluster.html a
> couple of times, I have some questions:
> 1) it says to run analyze after doing a cluster. i'm assuming autovacuum will
> take care of this? or should i go ahead and do the analyze 'now' instead of
> waiting?

It's always a good idea to analyze after major DB operations. Autovacuum
only runs so often. Also, it won't hurt anything, so why risk not doing
it?

> 2) is there any internal data in the db that would allow me to
> programmatically determine which tables would benefit from being clustered?
> 3) for that matter, is there info to allow me to determine which index it
> should be clustered on in cases where the table has more than one index?

The pg_stat_user_indexes table keeps stats on how often the index is used.
Indexes that are used frequently are candidates for clustering.

> 4) for tables with >1 indexes, does clustering on one index negatively impact
> queries that use the other indexes?

Not likely. Clustering only really helps performance if you have an index
that is used to gather ranges of data. For example, if you frequently
do things like SELECT * FROM log WHERE logdate > 'somedate" and < 'somedate,
you might benefit from clustering on logdate.

But it doesn't really do much if you're only ever pulling one record at a
time. It's the kind of thing that you really need to experiment with to
understand whether it will have a worthwhile performance impact on your
data and your workload. I doubt if there's any pat answer.

> 5) is it better to cluster on a compound index (index on lastnamefirstname) or
> on the underlying index (index on lastname)?

If cluster helps you at all, it's going to help if you have an index that's
frequently used to fetch ranges of data. Whether that index is compound or
not isn't likely to factor in.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran(at)collaborativefusion(dot)com
Phone: 412-422-3463x4023

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Douglas J Hunley 2008-02-27 18:35:16 Re: questions about CLUSTER
Previous Message Laurent Raufaste 2008-02-27 17:38:48 Re: PG planning randomly ?