Re: Grouped Index Tuples / Clustered Indexes

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples / Clustered Indexes
Date: 2007-03-11 19:54:56
Message-ID: 1173642897.3641.465.camel@silverbirch.site
Lists: pgsql-hackers

On Sun, 2007-03-11 at 11:22 +0000, Heikki Linnakangas wrote:
> Gregory Stark wrote:
> >> On Wed, 2007-03-07 at 10:32 +0000, Heikki Linnakangas wrote:
> >>> I've been thinking
> >>> we should call this feature just Clustered Indexes
> >
> > So we would have "clustered tables" which are tables whose heap is ordered
> > according to an index and separately "clustered indexes" which are indexes
> > optimized for such tables?
>
> Yes, that's what I was thinking.
>
> There's a third related term in use as well. When you issue CLUSTER, the
> table will be clustered on an index. And that index is then the "index
> the table is clustered on". That's a bit cumbersome but that's the
> terminology we're using at the moment. Maybe we should come up with a
> new term for that to avoid confusion.

First thought: we can use the term "cluster*ing* index" for CLUSTER and
use the term "clustered" to refer to what has happened to the table and
the index. That will probably be confused with high-availability
clustering, so perhaps not.

Better thought: say that CLUSTER requires an "order-defining index".
That better explains the point that it is the table being clustered,
using the index to define the physical order of the rows in the heap. We
then use the word "clustered" to refer to what has happened to the
table, and with this patch, for the index also.

That way we can have a new syntax for CLUSTER:

CLUSTER table ORDER BY indexname

which is then the preferred syntax, rather than the perverse

CLUSTER index ON table

which gives the wrong impression about what is happening, since it is
the table that is changed, not the index.

- - -

- Are you suggesting that we have an explicit new syntax

CREATE [UNIQUE] CLUSTERED INDEX [CONCURRENTLY] fooidx ON foo (....) ...

or just that we refer to this feature as Clustered Indexes?

- Do we still need the index WITH option, in either case?

- Do you think that all Primary Keys should be clustered?

- Are you thinking of renaming the docs, catalog, etc. to reflect the new
naming/meaning?

My thinking would be, in order: CLUSTERED, no, yes, yes,
but I'd like to know what you think.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
