Re: index-only scans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: index-only scans
Date: 2011-10-08 00:14:53
Message-ID: 20888.1318032893@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Please find attached a patch implementing a basic version of
> index-only scans. This patch is the work of my colleague Ibrar Ahmed
> and myself, and also incorporates some code from previous patches
> posted by Heikki Linnakanagas.

I've committed this after some rather substantial editorialization.
There's still a lot left to do of course, but it seems to need
performance testing next, and that'll be easier if the code is in HEAD.

> 1. The way that nodeIndexscan.c builds up the faux heap tuple is
> perhaps susceptible to improvement. I thought about building a
> virtual tuple, but then what do I do with an OID column, if I have
> one? Or maybe this should be done some other way altogether.

I switched it to use a virtual tuple for now, and just not attempt to
use index-only scans if a system column is required. We're likely to
want to rethink this anyway, because as currently constituted the code
can't do anything with an expression index, and avoiding recalculation
of an expensive function could be a nice win. But the approach of
just building a faux heap tuple fundamentally doesn't work for that.

> 2. Suppose we scan one tuple on a not-all-visible page followed by 99
> tuples on all-visible pages. The code as written will hold the pin on
> the first heap page for the entire scan. As soon as we hit the end of
> the scan or another tuple where we have to actually visit the page,
> the old pin will be released, but until then we hold onto it.

I did not do anything about this issue --- ISTM it needs performance
testing.

> 3. The code in create_index_path() builds up a bitmapset of heap
> attributes that get used for any purpose anywhere in the query, and
> hangs it on the RelOptInfo so it doesn't need to be rebuilt for every
> index under consideration. However, if it were somehow possible to
> have the rel involved without using any attributes at all, we'd
> rebuild the cache over and over, since it would never become non-NULL.

I dealt with this by the expedient of getting rid of the caching ;-).
It's not clear to me that it was worth the trouble, and in any case
it's fundamentally wrong to suppose that every index faces the same
set of attributes it must supply. It should not need to supply columns
that are only needed in indexquals or index predicate conditions.
I'm not sure how to deal with those refinements cheaply enough, but
the cache isn't helping.

> 4. There are a couple of cases that use index-only scans even though
> the EXPLAIN output sort of makes it look like they shouldn't. For
> example, in the above queries, an index-only scan is chosen even
> though the query does "SELECT *" from the table being scanned. Even
> though the EXPLAIN (VERBOSE) output makes it look otherwise, it seems
> that the target list of an EXISTS query is in fact discarded, e.g.:

The reason it looks that way is that we're choosing to use a "physical
result tuple" to avoid an ExecProject step at runtime. There's nothing
wrong with the logic, it's just that EXPLAIN shows something that might
mislead people.

> 5. We haven't made any planner changes at all, not even for cost
> estimation. It is not clear to me what the right way to do cost
> estimation here is.

Yeah, me either. For the moment I put in a hard-wired estimate that
only 90% of the heap pages would actually get fetched. This is
conservative and only meant to ensure that the planner picks an
index-only-capable plan over an indexscan with a non-covering index,
all else being equal. We'll need to do performance testing before
we can refine that.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-10-08 01:10:53 Re: Why does WAL_DEBUG macro need to be defined by default?
Previous Message Noah Misch 2011-10-07 22:07:19 Re: [v9.2] Fix Leaky View Problem