Re: We need index-only scans

From: MARK CALLAGHAN <mdcallag(at)gmail(dot)com>
To: Kristian Nielsen <knielsen(at)knielsen-hq(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We need index-only scans
Date: 2010-12-04 03:26:27
Message-ID: AANLkTinVNF8WEoKoqAAsDPynTGSqk5tBGT-vjETePnTc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 1, 2010 at 3:53 AM, Kristian Nielsen
<knielsen(at)knielsen-hq(dot)org> wrote:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
>
>> Just so everyone is on the same page.... Even once we have index-only
>> scans they won't be anywhere near as useful with Postgres as they are
>> with Oracle and other databases. At least not unless we find a
>> solution for a different problem -- our inability to scan btree
>> indexes sequentially.
>
> True, however I would not be too pessimistic about this.
>
> For OLTP like typical web applications, index-only scans are a killer feature
> for being able to read N rows with 1 I/O (for some small N), when the data no
> longer fits in the buffer pool, or after cold start.
>
> Eg. read all account settings for one user account, or subjects of all
> messages, etc. A composite index with user_id in the first column allows to
> fetch all N rows from one (or a few) disk pages with an index-only scan, as
> opposed to N disk pages.
>
> So for this, index-only scans can make a _big_ difference, even without
> support for Oracle-type index fast-full-scans.

I am not trying start a MySQL vs PostgreSQL thread. I lurk on these
lists to learn more about PostgreSQL.

I know that PostgreSQL is good at OLTP and complex query processing
and that index fast-full scans can make a big difference for large
joins, but the workload that I care about is OLTP-only. PostgreSQL
will be more efficient on that workload with support for index-only
scans. The majority of the load is simple queries -- joins that touch
a few rows, short index range scans and index point lookups. With
covering indexes and InnoDB the queries do a few disk reads in the
worst case. Without covering indexes the queries do extra disk IO in
the worst case (buffer pool read miss) and this is much worse for the
range scans. I assume that the behavior with covering indexes but
without index-only scans is similar to not having index-only scans.

I collect 95th percentile response times for my popular queries and
they are much improved with the use of covering indexes.

--
Mark Callaghan
mdcallag(at)gmail(dot)com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Vaibhav Kaushal 2010-12-04 03:49:02 Requirement for the Buffer manager to load multiple pages at once
Previous Message Tatsuo Ishii 2010-12-04 02:18:20 Streaming replication document