Re: PATCH: index-only scans with partial indexes

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: index-only scans with partial indexes
Date: 2015-09-04 16:10:47
Message-ID: CAMkU=1y9JGr4=ATZJ7RoyoHWx-eK7fhjECbEmHkeXo7bD37poA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 4, 2015 at 4:28 AM, Anastasia Lubennikova <
a(dot)lubennikova(at)postgrespro(dot)ru> wrote:

>
>
> 25.08.2015 20:19, Jeff Janes пишет:
>
> On Fri, Jul 10, 2015 at 11:29 AM, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> Hi,
>>
>> currently partial indexes end up not using index only scans in most
>> cases, because check_index_only() is overly conservative, as explained in
>> this comment:
>>
>> * XXX this is overly conservative for partial indexes, since we will
>> * consider attributes involved in the index predicate as required even
>> * though the predicate won't need to be checked at runtime. (The same
>> * is true for attributes used only in index quals, if we are certain
>> * that the index is not lossy.) However, it would be quite expensive
>> * to determine that accurately at this point, so for now we take the
>> * easy way out.
>>
>> In other words, unless you include columns from the index predicate to
>> the index, the planner will decide index only scans are not possible. Which
>> is a bit unfortunate, because those columns are not needed at runtime, and
>> will only increase the index size (and the main benefit of partial indexes
>> is size reduction).
>>
>> The attached patch fixes this by only considering clauses that are not
>> implied by the index predicate. The effect is simple:
>>
>> create table t as select i as a, i as b from
>> generate_series(1,10000000) s(i);
>>
>> create index tidx_partial on t(b) where a > 1000 and a < 2000;
>>
>> vacuum freeze t;
>> analyze t;
>>
>> explain analyze select count(b) from t where a > 1000 and a < 2000;
>>
>
>
> However, "explain analyze select sum(b) from t where a > 1000 and a <
> 1999;" still doesn't use the index only
> scan. Isn't that also implied by the predicate?
>
>
> In this example it doesn't use IndexOnlyScan correctly. If I understand
> partial indexes right, if index predicate and search clause are not equal,
> index scan must recheck values when it's fetching them.
> 'tidx_partial' in example above has no information about 'a' attribute,
> beside the index->indpred, so it is impossible to recheck qual without
> referencing to table.
>
> In example:
> create index tidx_partial on t(a) where a > 1000 and a < 2000;
> explain analyze select sum(a) from t where a > 1000 and a < 1999;
> it can use IndexOnlyScan.
>

Yes, of course. Thanks for the explanation, it is obvious now that you
have explained it. I kept slipping into thinking that the
predicate-dependent variables are included in the index but only when the
predicate is met, but that isn't the case.

How can we evaluate Tom's performance concerns? I tried
turning log_planner_stats on and using the regression test as a load
generator, but I don't think that that is very demanding of a test.

Thanks,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2015-09-04 16:21:38 Re: Allow a per-tablespace effective_io_concurrency setting
Previous Message Daniel Verite 2015-09-04 16:08:10 Re: [patch] Proposal for \rotate in psql