Re: Index Onlys Scan for expressions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index Onlys Scan for expressions
Date: 2016-08-15 22:37:34
Message-ID: 31014886-8607-0b3b-66ec-b5947d117ccf@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/16/2016 12:03 AM, Ildar Musin wrote:
> Hi, hackers!
>
> There is a known issue that index only scan (IOS) can only work with
> simple index keys based on single attributes and doesn't work with index
> expressions. In this patch I propose a solution that adds support of IOS
> for index expressions. Here's an example:
>
> create table abc(a int, b int, c int);
> create index on abc ((a * 1000 + b), c);
>
> with t1 as (select generate_series(1, 1000) as x),
> t2 as (select generate_series(0, 999) as x)
> insert into abc(a, b, c)
> select t1.x, t2.x, t2.x from t1, t2;
> vacuum analyze;
>
> Explain results with the patch:
>
> explain (analyze, buffers) select a * 1000 + b + c from abc where a *
> 1000 + b = 1001;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------
>
> Index Only Scan using abc_expr_c_idx on abc (cost=0.42..4.45 rows=1
> width=4) (actual time=0.032..0.033 rows=1 loops=1)
> Index Cond: ((((a * 1000) + b)) = 1001)
> Heap Fetches: 0
> Buffers: shared hit=4
> Planning time: 0.184 ms
> Execution time: 0.077 ms
> (6 rows)
>
> Before the patch it was:
>
> explain (analyze, buffers) select a * 1000 + b + c from abc where a *
> 1000 + b = 1001;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------
>
> Index Scan using abc_expr_c_idx on abc (cost=0.42..8.45 rows=1
> width=4) (actual time=0.039..0.041 rows=1 loops=1)
> Index Cond: (((a * 1000) + b) = 1001)
> Buffers: shared hit=4
> Planning time: 0.177 ms
> Execution time: 0.088 ms
> (5 rows)
>

Nice! I've only quickly skimmed through the diff, but it seems sane.
Please add the patch to the 2016-09 CF, though.

> This solution has limitations though: the restriction or the target
> expression tree (or its part) must match exactly the index. E.g. this
> expression will pass the check:
>
> select a * 1000 + b + 100 from ...
>
> but this will fail:
>
> select 100 + a * 1000 + b from ...
>
> because the parser groups it as:
>
> (100 + a * 1000) + b
>
> In this form it won't match any index key. Another case is when we
> create index on (a+b) and then make query like 'select b+a ...' or '...
> where b+a = smth' -- it won't match. This applies to regular index scan
> too. Probably it worth to discuss the way to normalize index expressions
> and clauses and work out more convenient way to match them.
> Anyway, I will be grateful if you take a look at the patch in
> attachment. Any comments and tips are welcome.

I don't think it's a major limitation - it's quite similar to the
limitation for partial indexes, i.e. with an index defined like

CREATE INDEX ON abc (c) WHERE a + b = 1000;

the index will not be used unless the query expression matches exactly.
So for example this won't work:

SELECT c FROM abc WHERE b + a = 1000;

because the variables are in the opposite order. Moreover, in the target
list it might be possible to use explicit parentheses to make it work,
no? That is, will this work?

select 100 + (a * 1000 + b) from ...

Or will it still break the IOS?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-08-15 22:45:35 Re: Let's get rid of the separate minor version numbers for shlibs
Previous Message Robert Haas 2016-08-15 22:28:53 Re: max_parallel_degree > 0 for 9.6 beta