
Re: Clustered/covering indexes (or lack thereof :-)

From: Bill Moran <wmoran(at)collaborativefusion(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: adrobj <adrobj(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Clustered/covering indexes (or lack thereof :-)
Date: 2007-11-16 19:51:35
Message-ID: 20071116145135.af61f4b1.wmoran@collaborativefusion.com
Lists: pgsql-performance
In response to Jeff Davis <pgsql(at)j-davis(dot)com>:

> On Sun, 2007-11-11 at 22:59 -0800, adrobj wrote:
> > This is probably a FAQ, but I can't find a good answer...
> > 
> > So - are there common techniques to compensate for the lack of
> > clustered/covering indexes in PostgreSQL? To be more specific - here is my
> > table (simplified):
> > 
> > topic_id int
> > post_id int
> > post_text varchar(1024)
> > 
> > The most used query is: SELECT post_id, post_text FROM Posts WHERE
> > topic_id=XXX. Normally I would have created a clustered index on topic_id,
> > and the whole query would take ~1 disk seek.
> > 
> > What would be the common way to handle this in PostgreSQL, provided that I
> > can't afford 1 disk seek per record returned?
> > 
> 
> Periodically CLUSTER the table on the topic_id index. The table will not
> be perfectly clustered at all times, but it will be close enough that it
> won't make much difference.
> 
> There's still the hit of performing a CLUSTER, however.
> 
> Another option, if you have a relatively small number of topic_ids, is
> to break it into separate tables, one for each topic_id.
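
For reference, a rough sketch of the periodic CLUSTER approach quoted above. The table name posts and the index name posts_topic_id_idx are assumptions, and the CLUSTER ... USING form needs PostgreSQL 8.3 or later:

```sql
-- Assumed table, taken from the simplified layout in the question.
CREATE TABLE posts (
    topic_id  int,
    post_id   int,
    post_text varchar(1024)
);

-- Hypothetical index on the lookup column.
CREATE INDEX posts_topic_id_idx ON posts (topic_id);

-- Rewrite the table in index order. This takes an exclusive lock on the
-- table for the duration, so run it during a quiet period.
CLUSTER posts USING posts_topic_id_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE posts;
```

Later runs can use CLUSTER posts with no index name, since the table remembers which index it was last clustered on.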

Or materialize the data, if performance is the utmost requirement.

Create a second table:
CREATE TABLE materialized_topics (
 topic_id int,
 post_ids int[],
 post_texts text[]
);

Now add a trigger to your original table that updates materialized_topics
any time a row in the original table is inserted, updated, or deleted.
Each lookup then reads one row per topic instead of one row per post.
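
A minimal sketch of such a trigger, not a tuned implementation. It assumes the original table is named posts, adds a primary key on topic_id so lookups use an index, and simply rebuilds the affected topic's row on every change. The function and trigger names are made up for illustration, and the ordered array_agg needs PostgreSQL 9.0 or later, which is newer than the releases current when this thread was written:

```sql
-- Assumed base table, from the original question.
CREATE TABLE posts (
    topic_id  int,
    post_id   int,
    post_text varchar(1024)
);

-- Same layout as suggested, plus a primary key (an addition of this sketch).
CREATE TABLE materialized_topics (
    topic_id   int PRIMARY KEY,
    post_ids   int[],
    post_texts text[]
);

-- Rebuild the materialized row for one topic from the base table.
-- If the topic has no posts left, the row is removed and not re-created.
CREATE FUNCTION refresh_materialized_topic(p_topic_id int) RETURNS void AS $$
BEGIN
    DELETE FROM materialized_topics WHERE topic_id = p_topic_id;
    INSERT INTO materialized_topics (topic_id, post_ids, post_texts)
    SELECT topic_id,
           array_agg(post_id ORDER BY post_id),
           array_agg(post_text::text ORDER BY post_id)
    FROM posts
    WHERE topic_id = p_topic_id
    GROUP BY topic_id;
END;
$$ LANGUAGE plpgsql;

-- Row-level trigger function: refresh the old topic, the new topic, or both.
CREATE FUNCTION posts_materialize_trigger() RETURNS trigger AS $$
BEGIN
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        PERFORM refresh_materialized_topic(OLD.topic_id);
    END IF;
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        PERFORM refresh_materialized_topic(NEW.topic_id);
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER posts_materialize
AFTER INSERT OR UPDATE OR DELETE ON posts
FOR EACH ROW EXECUTE PROCEDURE posts_materialize_trigger();

-- The hot query then becomes a single-row lookup:
SELECT post_ids, post_texts FROM materialized_topics WHERE topic_id = 42;
```

Rebuilding the whole row keeps the trigger short, but it rescans the topic's posts on every write, and concurrent writers to the same topic can collide on the primary key, so a busy table would want something more careful.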

Of course, this may be non-optimal if that table sees a lot of writes,
since each one rewrites the whole array row for its topic.

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran(at)collaborativefusion(dot)com
Phone: 412-422-3463x4023
