Re: Grouped Index Tuples / Clustered Indexes

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Grouped Index Tuples / Clustered Indexes
Date: 2007-03-12 12:53:09
Message-ID: 45F54D35.7090204@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs wrote:
> Better thought: say that CLUSTER requires an "order-defining index".
> That better explains the point that it is the table being clustered,
> using the index to define the physical order of the rows in the heap. We
> then use the word "clustered" to refer to what has happened to the
> table, and with this patch, for the index also.
>
> That way we can have new syntax for CLUSTER
>
> CLUSTER table ORDER BY indexname
>
> which is then the preferred syntax, rather than the perverse
>
> CLUSTER index ON table
>
> which gives the wrong impression about what is happening, since it is
> the table that is changed, not the index.

I like that, "order-defining index" conveys the point pretty well.

> - Are you suggesting that we have an explicit new syntax
>
> CREATE [UNIQUE] CLUSTERED INDEX [CONCURRENTLY] fooidx ON foo (....) ...
>
> or just that we refer to this feature as Clustered Indexes?

I'm not proposing new syntax, just a WITH-parameter. Makes more sense to
me that way, the clusteredness has no user-visible effects except
performance, and it's b-tree specific (though I guess you could apply
the same concept to other indexams as well).

> - Do you think that all Primary Keys should be clustered?

No. There's a significant CPU overhead when the index and table are in
memory and you're doing simple one-row lookups. And there's no promise
that a table is physically in primary key order anyway.

There might be some interesting cases where we could enable it
automatically. I've been thinking that if you explicitly CLUSTER a
table, the order-defining index would definitely benefit from being a
clustered index. If it's small enough that it fits in memory, there's no
point in running CLUSTER in the first place. And if you run CLUSTER, we
know it's in order. That seems like a pretty safe bet.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Gaetano Mendola 2007-03-12 13:01:05 Re: Calculated view fields (8.1 != 8.2)
Previous Message Heikki Linnakangas 2007-03-12 12:25:18 Bitmapscan changes