On Thu, Aug 26, 2004 at 11:39:42PM -0400, Greg Stark wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Updated TODO item:
> > o Automatically maintain clustering on a table
> > This would require some background daemon to maintain clustering
> > during periods of low usage. It might also require tables to be only
> > paritally filled for easier reorganization. It also might require
> > creating a merged heap/index data file so an index lookup would
> > automatically access the heap data too.
> Fwiw, I would say the first "would" is also a "might". None of the previous
> discussions here presumed a maintenance daemon. The discussions before talked
> about a mechanism to try to place new tuples as close as possible to the
> proper index position.
> I would also suggest making some distinction between a cluster system similar
> to what we have now but improved to maintain the clustering continuously, and
> an actual index-organized-table where the tuples are actually only stored in a
> btree structure.
> They're two different approaches to similar problems. But they might both be
> useful to have, and have markedly different implementation details.
There's a third approach that I think is worth considering. Half of the
benefit to clustered tables is limiting the number of pages you need to
access when scanning the primary key. The order of tuples in the pages
themselves isn't nearly as important as ordering of the pages. This
means you can get most of the benefit of an index-organized table just
by being careful about what page you place a tuple on. What I'm thinking
of is some means to ensure all the tuples on a page are within some PK
range, but not worrying about the exact order within the page since it's
relatively cheap to scan through the page in memory.
This would probably mean less change to the code that inserts tuples.
No need for a background daemon.
No need to create a new B-Tree table structure.
Ideally, there won't be need to move tuples around, which should mean
that current indexing code doesn't need to change.
Need to have some way to deal with pages that fill up.
To gain full benefit some means of indicating what range of PK values
are on a page might be needed.
It's not as beneficial as a true IOT since you don't get the benefit of
storing your tuples inline with your B-Tree.
I'm sure there's a ton of things I'm missing, especially since I'm not
familiar with the postgresql code, but hopefully others can explore this
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
In response to
pgsql-performance by date
|Next:||From: Ron St-Pierre||Date: 2004-09-01 00:18:14|
|Subject: Re: Table UPDATE is too slow|
|Previous:||From: Josh Berkus||Date: 2004-08-31 22:03:05|
|Subject: Re: Context Switching issue: Spinlock doesn't fix.|