Re: A thought on Index Organized Tables

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, heikki(dot)linnakangas(at)enterprisedb(dot)com, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>, Karl Schnaitter <karlsch(at)gmail(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A thought on Index Organized Tables
Date: 2010-02-24 18:12:47
Message-ID: 407d949e1002241012p5882cef8j6122a6f56d5a8ac1@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 24, 2010 at 5:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Greg Stark <gsstark(at)mit(dot)edu> wrote:
>>> That doesn't work because when you split an index page any
>>> sequential scan in progress will either see the same tuples twice
>>> or will miss some tuples depending on where the new page is
>>> allocated. Vacuum has a clever trick for solving this but it
>>> doesn't work for arbitrarily many concurrent scans.
>
>> It sounds like you're asserting that Index Scan nodes are inherently
>> unreliable, so I must be misunderstanding you.
>
> We handle splits in a manner that ensures that concurrent index-order
> scans remain consistent.  I'm not sure that it's possible to scale that
> to ensure that both index-order and physical-order scans would remain
> consistent.  It might be soluble but it's certainly something to worry
> about.

It might be slightly easier given the assumption that you only want to
scan leaf tuples.
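
To make the split hazard quoted above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not PostgreSQL code: the "index" is a plain list of blocks, the function name physical_scan and its parameters are hypothetical, and a split simply moves the upper half of one block into a free block. Whether the scan sees tuples twice or misses them depends only on whether the new block lies ahead of or behind the scan position.

```python
def physical_scan(pages, split_after_block, split_block, new_block):
    """Read blocks in physical order (toy model, hypothetical names).

    Right after block `split_after_block` has been read, a concurrent
    "split" moves the upper half of `split_block` into the free block
    `new_block`. Returns every key the scan saw, in the order seen.
    """
    seen = []
    for blkno in range(len(pages)):
        seen.extend(pages[blkno])
        if blkno == split_after_block:
            src = pages[split_block]
            half = len(src) // 2
            pages[new_block] = src[half:]
            pages[split_block] = src[:half]
    return seen


# Case 1: block 0 was already read, then splits into block 2, which is
# still ahead of the scan, so keys 3 and 4 are returned twice.
dup_result = physical_scan([[1, 2, 3, 4], [5, 6], []], 0, 0, 2)

# Case 2: block 2 has not been read yet, then splits into block 0, which
# is already behind the scan, so keys 5 and 6 are never returned.
miss_result = physical_scan([[], [1, 2], [3, 4, 5, 6]], 0, 2, 0)
```

An index-order scan avoids this by following page links rather than block numbers, which is the consistency property the quoted text refers to.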

But there's an additional problem I didn't think of before. Currently
we optimize index scans by copying all relevant tuples to local memory
so we don't need to hold an index lock for an extended time or spend a
lot of time relocking and rechecking the index for changes. That
wouldn't be possible if we needed to get visibility info from the page
since we would need up-to-date information.
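
A rough illustration of that trade-off, again as a toy Python model and not the actual implementation: LeafPage, copy_matching, and the per-item visibility flag are all hypothetical names, and the flag stands in for visibility info stored in the index. The scan copies matching items while holding the page lock and then works from the private copy, so anything that changes on the page afterwards is invisible to it.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class LeafPage:
    # Each item is (key, tid, visible); the `visible` flag is a
    # hypothetical stand-in for visibility info kept in the index.
    items: list
    lock: threading.Lock = field(default_factory=threading.Lock)


def copy_matching(page, lo, hi):
    """Hold the page lock only long enough to copy the matching items."""
    with page.lock:
        return [item for item in page.items if lo <= item[0] <= hi]


page = LeafPage(items=[(1, "(0,1)", True), (2, "(0,2)", True), (3, "(0,3)", True)])

# The scan takes its private copy and releases the lock.
local_copy = copy_matching(page, 1, 2)

# Later, a concurrent change marks key 2 as no longer visible on the page.
with page.lock:
    page.items[1] = (2, "(0,2)", False)

# The scan still believes key 2 is visible, because it reads its copy.
stale_keys = [key for key, _tid, visible in local_copy if visible]
```

In this model stale_keys still contains key 2 even though the page now says otherwise, which is the staleness problem described above. Avoiding it would mean going back to the page for each item, giving up the benefit of the copy.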

--
greg
