Re: cluster index on a table

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: cluster index on a table
Date: 2009-07-16 20:49:26
Message-ID: 407d949e0907161349r110ddd22v568ccfbf0281a682@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Jul 16, 2009 at 9:06 PM, Scott Carey<scott(at)richrelevance(dot)com> wrote:
> Keep the old page around or a copy of it that old transactions reference?
> Just more Copy on Write.
> How is that different from a nested loop on an index scan/seek currently?
> Doesn't an old transaction have to reference an old heap page through an
> index with the current implementation?  At least, the index references
> multiple versions and their visibility must be checked.  Can a similar
> solution work here?  Just reference the pre and post split pages and filter
> by visibility?

That would be a completely different architecture than we have now.
We're not Oracle, Postgres does all this at the tuple level, not at
the page level. We have tuple versions, not page versions, and tuple
locks, not page locks.

The old transaction has to reference a heap page through an index with
the current implementation. But it can do so safely precisely because
the tuple will be where the index references it as long as necessary.
As long as that old transaction is live it's guaranteed not to be
removed by vacuum (well... except by VACUUM FULL but that's a whole
nother story).

Actually this is probably the clearest problem with IOT in the
Postgres universe. What do other indexes use to reference these rows
if they can move around?

I wanted to call Heikki's "grouped index item" patch that he worked on
for so long index organized tables. Basically that's what they are
except the leaf tuples are stored in the regular heap like always,
hopefully in index order. And there are no leaf tuples in the index so
the actual index is much much smaller. It doesn't look like a
traditional IOT but it behaves a lot like one in the space savings and
access patterns.

--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marc Cousin 2009-07-16 20:50:10 Re: Very big insert/join performance problem (bacula)
Previous Message Dimitri Fontaine 2009-07-16 20:43:29 Re: Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1