Re: We need index-only scans

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: We need index-only scans
Date: 2010-11-12 14:39:35
Message-ID: 4CDD51A7.6060406@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/12/2010 09:17 AM, Bruce Momjian wrote:
> Greg Stark wrote:
>> On Wed, Nov 10, 2010 at 4:04 AM, Bruce Momjian<bruce(at)momjian(dot)us> wrote:
>>> We last researched index-only scans, also called covering indexes, in
>>> September of 2008, but have made little progress on it since. ?Many have
>>> been waiting for Heikki to implement this but I talked to him and he
>>> doesn't have time.
>>>
>>> I believe it is time for the community to move forward and I would like
>>> to assemble a team to work on this feature. ?We might not be able to
>>> implement it for Postgres 9.1, but hopefully we can make some progress
>>> on this.
>> Just so everyone is on the same page.... Even once we have index-only
>> scans they won't be anywhere near as useful with Postgres as they are
>> with Oracle and other databases. At least not unless we find a
>> solution for a different problem -- our inability to scan btree
>> indexes sequentially.
>>
>> In Oracle "Fast Full Index" scans are particularly useful for things
>> like unconstrained select count(*). Since the scan can scan through
>> the index sequentially and the index is much smaller than the table it
>> can count all the values fairly quickly even on a very wide table.
>>
>> In Postgres, aside from the visibility issues we have a separate
>> problem. In order to achieve high concurrency we allow splits to occur
>> without locking the index. And the new pages can be found anywhere in
>> the index, even to the left of the existing page. So a sequential scan
>> could miss some data if the page it's on is split and some of the data
>> is moved to be to the left of where our scan is.
>>
>> It's possible this is a non-issue in the future due to large RAM sizes
>> and SSDs. Large amounts of RAM mean perhaps indexes will be in memory
>> much of the time and SSDs might mean that scanning the btree in index
>> order might not really be that bad.
> Agreed. I updated the index-only scans wiki for this:
>
> http://wiki.postgresql.org/wiki/Index-only_scans
>
> test speed improvement for scans of the entire index (this involves
> random I/O)
> * we can't scan the index in physical order like vacuum does

For unconstrained select count(*), why does scanning in index order matter?

cheers

andrew

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-11-12 14:44:35 Re: MULTISET and additional functions for ARRAY
Previous Message Kevin Grittner 2010-11-12 14:28:27 Re: multi-platform, multi-locale regression tests