Re: cluster index on a table

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Justin Pitts <justinpitts(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Ibrahim Harrani <ibrahim(dot)harrani(at)gmail(dot)com>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Subject: Re: cluster index on a table
Date: 2009-07-16 20:06:10
Message-ID: C684D642.A42D%scott@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 7/16/09 12:46 PM, "Greg Stark" <gsstark(at)mit(dot)edu> wrote:

> On Thu, Jul 16, 2009 at 8:18 PM, Scott Carey<scott(at)richrelevance(dot)com> wrote:
>> " Each index row in the nonclustered index contains the nonclustered key
>> value and a row locator. This locator points to the data row in the
>> clustered index or heap having the key value."
>>
>> That sort of model should work with MVCC and even HOT with the same
>> restrictions that HOT has now.
>
>
> The problem with this is that btree indexes need to be able to split
> pages. In which case your tuple's tid changes and all hell breaks
> loose. One of the fundamental design assumptions in our MVCC design is
> that you can trust a tuple to stay where it is as long as it's visible
> to your transaction.
>
> For example you may want to go back and check the discussion on
> getting vacuum to do a sequential scan of indexes. The solution we
> found for that only works because only a single vacuum can be scanning
> the index at a time.
>
> Another scenario to think about, picture yourself in the middle of a
> nested loop processing all the matches for a tuple in the outer
> relation. Now someone else comes along and wants to insert a new tuple
> on the same page as that outer tuple and has to split the page. How do
> you do that without messing up the nested loop which may not come back
> to that page for many minutes?
>

Keep the old page around or a copy of it that old transactions reference?
Just more Copy on Write.
How is that different from a nested loop on an index scan/seek currently?
Doesn't an old transaction have to reference an old heap page through an
index with the current implementation? At least, the index references
multiple versions and their visibility must be checked. Can a similar
solution work here? Just reference the pre and post split pages and filter
by visibility?

>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2009-07-16 20:07:25 Re: Very big insert/join performance problem (bacula)
Previous Message Greg Stark 2009-07-16 19:46:00 Re: cluster index on a table