Re: Page at a time index scan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Page at a time index scan
Date: 2006-05-05 18:01:37
Message-ID: 26626.1146852097@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

I wrote:
> BTW, I just realized another bug in the patch: btbulkdelete fails to
> guarantee that it visits every page in the index. It was OK for
> btvacuumcleanup to ignore pages added to the index after it starts,
> but btbulkdelete has to deal with such pages.

Actually, as written this patch does not work. At all. btbulkdelete
has to guarantee that it removes every index entry it's told to, and
it cannot guarantee that in the presence of concurrent page splits.
A split could move index items from a page that btbulkdelete hasn't
visited to one it's already passed over. This is not possible with an
index-order traversal (because splits only move items to the right)
but it's definitely possible with a physical-order traversal.

I was toying with the idea of remembering deletable pages (which
btvacuumcleanup does anyway), which are the only ones that page splits
could move items to, and then rescanning those after the completion
of the primary pass. This has a couple of pretty unpleasant
consequences though:
* We have to remember *every* deletable page for correctness, compared
to the current situation where btvacuumcleanup can bound the number of
pages it tracks. This creates a situation where VACUUM may fail
outright if it doesn't have gobs of memory. Since one of the main
reasons for developing lazy VACUUM was to ensure we could vacuum
arbitrarily large tables in bounded memory, I'm not happy with this.
* The rescan could be far from cheap if there are many such pages.

Also, I'm not sure when you can stop: it certainly seems possible that
items could be moved down during the primary scan and then moved down
again while btbulkdelete is reconsidering the deletable pages. Without
locking out splits entirely, it's far from obvious that we can make it
work.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Martijn van Oosterhout 2006-05-05 18:07:06 Re: Have configure complain about unknown options
Previous Message Sven Suursoho 2006-05-05 17:58:26 Re: plpython improvements