Re: index-only scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: index-only scans
Date: 2011-10-06 19:46:07
Message-ID: CA+Tgmob1HVRmONeD4VgA0rX6sQrW-gc+GiMsOxqMWh2hCgzkOQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 6, 2011 at 3:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Please find attached a patch implementing a basic version of
>> index-only scans.  This patch is the work of my colleague Ibrar Ahmed
>> and myself, and also incorporates some code from previous patches
>> posted by Heikki Linnakangas.
>
> I'm starting to review this patch now.

Thanks!

> Has any further work been done
> since the first version was posted?  Also, was any documentation
> written?  I'm a tad annoyed by having to reverse-engineer the changes
> in the AM API specification from the code.

Not really. We have detected a small performance regression when both
heap and index fit in shared_buffers and an index-only scan is used.
I have a couple of ideas for improving this. One is to store a
virtual tuple into the slot instead of building a regular tuple, but
what do we do about tuples with OIDs? Another is to avoid locking the
index buffer multiple times - right now it locks the index buffer to
get the TID, and then relocks it to extract the index tuple (after
checking that nothing disturbing has happened meanwhile). It seems
likely that with some refactoring we could get this down to a single
lock/unlock cycle, but I haven't figured out exactly where the TID
gets copied out.
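
To make the locking point concrete, here is a rough sketch of the two
patterns. This is not code from the patch: DemoPage, demo_lock and the
version counter are made-up stand-ins, and I'm assuming that "nothing
disturbing has happened" can be modeled as an unchanged counter on the
page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an index page; this is not PostgreSQL's buffer API. */
typedef struct DemoPage {
    int lock_count;   /* number of times the page has been locked */
    int version;      /* assumed change counter, bumped on modification */
    int tid;          /* heap tuple identifier held in the index entry */
    int key;          /* indexed value held in the index entry */
} DemoPage;

static void demo_lock(DemoPage *p)   { assert(p != NULL); p->lock_count++; }
static void demo_unlock(DemoPage *p) { (void) p; /* no-op in this model */ }

/* Two lock/unlock cycles: fetch the TID, then relock to copy the tuple. */
static bool
demo_fetch_two_cycles(DemoPage *p, int *tid, int *key)
{
    int  seen_version;
    bool unchanged;

    demo_lock(p);
    *tid = p->tid;
    seen_version = p->version;
    demo_unlock(p);

    /* Another backend could modify the page at this point. */

    demo_lock(p);
    unchanged = (p->version == seen_version);
    if (unchanged)
        *key = p->key;          /* only trust the tuple if nothing changed */
    demo_unlock(p);
    return unchanged;
}

/* One cycle: copy the TID and the tuple under the same lock. */
static void
demo_fetch_one_cycle(DemoPage *p, int *tid, int *key)
{
    demo_lock(p);
    *tid = p->tid;
    *key = p->key;
    demo_unlock(p);
}
```

In this model the single-cycle version needs no recheck at all, which
is the saving I'm hoping for; whether the real code can be arranged
that way depends on where the TID actually gets copied out.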

With regard to the AM API, the basic idea is we're just adding a
Boolean to say whether the AM is capable of returning index tuples.
If it's not, then we don't ever try an index-only scan. If it is,
then we'll set the want_index_tuple flag if an index-only scan is
possible. This requests that the AM attempt to return the tuple; but
the AM is also allowed to fail and not return the tuple whenever it
wants. This is more or less the interface that Heikki suggested a
couple years back, but it might well be vulnerable to improvement.
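
As a minimal sketch of that handshake (not the patch's actual
definitions: every type and function name here is hypothetical except
want_index_tuple), the caller only requests index tuples from a capable
AM, and falls back to the heap whenever the AM declines:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All Demo* names are hypothetical stand-ins, not definitions from the patch. */

typedef struct DemoIndexTuple {
    int key;
} DemoIndexTuple;

typedef struct DemoAccessMethod {
    bool can_return_tuples;   /* the new capability Boolean */
} DemoAccessMethod;

typedef struct DemoScan {
    bool want_index_tuple;    /* request flag; never set for an incapable AM */
} DemoScan;

typedef enum DemoSource {
    DEMO_FROM_INDEX,          /* the AM handed back an index tuple */
    DEMO_FROM_HEAP            /* the caller must fetch the heap tuple */
} DemoSource;

/* Set the request flag only if the AM is capable and the scan qualifies. */
static DemoScan
demo_begin_scan(const DemoAccessMethod *am, bool index_only_possible)
{
    DemoScan scan;

    assert(am != NULL);
    scan.want_index_tuple = am->can_return_tuples && index_only_possible;
    return scan;
}

/*
 * 'returned' is whatever the AM produced for the current entry, or NULL if
 * it chose not to return a tuple.  The AM may decline at any time, so the
 * caller always needs the heap fallback.
 */
static DemoSource
demo_choose_source(const DemoScan *scan, const DemoIndexTuple *returned)
{
    if (scan->want_index_tuple && returned != NULL)
        return DEMO_FROM_INDEX;
    return DEMO_FROM_HEAP;
}
```

The point of making the request advisory is that the AM never has to
guarantee a tuple, so the heap path stays mandatory in the caller.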

Incidentally, if you happen to feel the urge to beat this around and
send it back rather than posting a list of requested changes, feel
free.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
