Re: We need index-only scans

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: We need index-only scans
Date: 2010-11-12 13:33:28
Message-ID: AANLkTi=+XvG8d-rAP2Z0-DB=Ek2_8v20pY8ssKW0Nd-L@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 10, 2010 at 4:04 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> We last researched index-only scans, also called covering indexes, in
> September of 2008, but have made little progress on it since.  Many have
> been waiting for Heikki to implement this but I talked to him and he
> doesn't have time.
>
> I believe it is time for the community to move forward and I would like
> to assemble a team to work on this feature.  We might not be able to
> implement it for Postgres 9.1, but hopefully we can make some progress
> on this.

Just so everyone is on the same page.... Even once we have index-only
scans they won't be anywhere near as useful with Postgres as they are
with Oracle and other databases. At least not unless we find a
solution for a different problem -- our inability to scan btree
indexes sequentially.

In Oracle "Fast Full Index" scans are particularly useful for things
like unconstrained select count(*). Since the scan can scan through
the index sequentially and the index is much smaller than the table it
can count all the values fairly quickly even on a very wide table.

In Postgres, aside from the visibility issues we have a separate
problem. In order to achieve high concurrency we allow splits to occur
without locking the index. And the new pages can be found anywhere in
the index, even to the left of the existing page. So a sequential scan
could miss some data if the page it's on is split and some of the data
is moved to be to the left of where our scan is.

It's possible this is a non-issue in the future due to large RAM sizes
and SSDs. Large amounts of RAM mean perhaps indexes will be in memory
much of the time and SSDs might mean that scanning the btree in index
order might not really be that bad.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2010-11-12 13:35:03 Re: multi-platform, multi-locale regression tests
Previous Message Kevin Grittner 2010-11-12 13:27:31 Re: multi-platform, multi-locale regression tests