RE: Index Skip Scan

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pg(at)bowt(dot)ie" <pg(at)bowt(dot)ie>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "jtc331(at)gmail(dot)com" <jtc331(at)gmail(dot)com>, "rafia(dot)pghackers(at)gmail(dot)com" <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "bhushan(dot)uparkar(at)gmail(dot)com" <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Floris van Nee <floris(dot)vannee(at)gmail(dot)com>
Subject: RE: Index Skip Scan
Date: 2020-03-21 22:00:01
Message-ID: c5c5c974714a47f1b226c857699e8680@opammb0561.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers,

Recently I've put some effort in extending the functionality of this patch. So far, we've been trying to keep the scope of this patch relatively small to DISTINCT-clauses only. The advantage of this approach was that it keeps impact to the indexam api to a minimum. However, given the problems we've been facing in getting the implementation to work correctly in all cases, I started wondering if this implementation was the right direction to go in. My main worry is that the current indexam api for skipping is not suited to other future use cases of skipping, but also that we're already struggling with it now to get it to work correctly in all edge cases.

In the approach taken so far, the amskip function is defined taking two ScanDirection parameters. The function amgettuple is left unchanged. However, I think we need amgettuple to take two ScanDirection parameters as well (or create a separate function amgetskiptuple). This patch proposes that.

Currently, I've just added 'skip' functions to the indexam api for beginscan and gettuple. Maybe it'd be better to just modify the existing functions to take an extra parameter instead. Any thoughts on this?

The result is a patch that can apply skipping in many more cases than previous patches. For example, filtering on columns that are not part of the index, properly handling visibility checks without moving these into the nbtree code, skipping not only on prefix but also on extra conditions that become available (eg. prefix a=1 and we happen to have a WHERE clause with b=200, which we can now use to skip all the way to a=1 AND b=200). There's a fair amount of changes in the nbtree code to support this.

Patch 0001 is Jesper's unique keys patch.
Patch 0002 modifies executor-level code to support skip scans and enables it for DISTINCT queries.
Patch 0003 just provides a very basic planner hack that enables the skip scan for practically all index scans (index, index only and bitmap index). This is not what we would normally want, but this way I could easily test the skip scan code. It's so large because I modify all the test cases expected results that now include an extra line 'Skip scan: All'. The actual code changed is only a few lines in this patch.

The planner part of the code still needs work. The planner code in this patch is similar to the previous patch. David's comments about projection paths haven't been addressed yet. Also, there's no proper way of hooking up the index scan for regular (non-DISTINCT) queries yet. That's why I hacked up patch 0003 just to test stuff.

I'd welcome any help on these patches. If someone with more planner knowledge than me is willing to do part of the planner code, please feel free to do so. I believe supporting this will speed up a large number of queries for all kinds of users. It can be a really powerful feature.

Tomas, would you be willing to repeat the performance tests you did earlier? I believe this version will perform better than the previous patch for the cases where you noticed the 10-20x slow-down. There will obviously still be a performance penalty for cases where the planner picks a skip scan that are not well suited, but I think it'll be smaller.

-Floris

-----
To give a few simple examples:

Initialization:
-- table t1 has 100 unique values for a
-- and 10000 b values for each a
-- very suitable for skip scan
create table t1 as select a,b,b%5 as c, random() as d from generate_series(1, 100) a, generate_series(1,10000) b;
create index on t1 (a,b,c);

-- table t2 has 10000 unique values for a
-- and 100 b values for each a
-- this is not very suitable for skip scan
-- because the next matching value is always either
-- on the current page or on the next page
create table t2 as select a,b,b%5 as c, random() as d from generate_series(1, 10000) a, generate_series(1,100) b;
create index on t2 (a,b,c);

analyze t1;
analyze t2;

-- First 'Execution Time' line is this patched version (0001+0002+0003) (without including 0003, the non-DISTINCT queries would be equal to master)
-- Second 'Execution Time' line is master
-- Third 'Execution Time' is previous skip scan patch version
-- Just ran a couple of times to give an indication
-- on order of magnitude, not a full benchmark.
select distinct on (a) * from t1;
Execution Time: 1.407 ms (current patch)
Execution Time: 480.704 ms (master)
Execution Time: 1.711 ms (previous patch)

select distinct on (a) * from t1 where b > 50;
Execution Time: 1.432 ms
Execution Time: 481.530 ms
Execution Time: 481.206 ms

select distinct on (a) * from t1 where b > 9990;
Execution Time: 1.074 ms
Execution Time: 33.937 ms
Execution Time: 33.115 ms

select distinct on (a) * from t1 where d > 0.5;
Execution Time: 0.811 ms
Execution Time: 446.549 ms
Execution Time: 436.091 ms

select * from t1 where b=50;
Execution Time: 1.111 ms
Execution Time: 33.242 ms
Execution Time: 36.555 ms

select * from t1 where b between 50 and 75 and d > 0.5;
Execution Time: 2.370 ms
Execution Time: 60.744 ms
Execution Time: 62.820 ms

select * from t1 where b in (100, 200);
Execution Time: 2.464 ms
Execution Time: 252.224 ms
Execution Time: 244.872 ms

select * from t1 where b in (select distinct a from t1);
Execution Time: 91.000 ms
Execution Time: 842.969 ms
Execution Time: 386.871 ms

select distinct on (a) * from t2;
Execution Time: 47.155 ms
Execution Time: 714.102 ms
Execution Time: 56.327 ms

select distinct on (a) * from t2 where b > 5;
Execution Time: 60.100 ms
Execution Time: 709.960 ms
Execution Time: 727.949 ms

select distinct on (a) * from t2 where b > 95;
Execution Time: 55.420 ms
Execution Time: 71.007 ms
Execution Time: 69.229 ms

select distinct on (a) * from t2 where d > 0.5;
Execution Time: 49.254 ms
Execution Time: 719.820 ms
Execution Time: 705.991 ms

-- slight performance degradation here compared to regular index scan
-- due to data unfavorable data distribution
select * from t2 where b=50;
Execution Time: 47.603 ms
Execution Time: 37.327 ms
Execution Time: 40.448 ms

select * from t2 where b between 50 and 75 and d > 0.5;
Execution Time: 244.546 ms
Execution Time: 228.579 ms
Execution Time: 227.541 ms

select * from t2 where b in (100, 200);
Execution Time: 64.021 ms
Execution Time: 242.905 ms
Execution Time: 258.864 ms

select * from t2 where b in (select distinct a from t2);
Execution Time: 758.350 ms
Execution Time: 1271.230 ms
Execution Time: 761.311 ms

I wrote a few things here about the method as well:
https://github.com/fvannee/postgres/wiki/Index-Skip-Scan
Code can be found there on Github as well in branch 'skip-scan'

Attachment Content-Type Size
0001-Unique-key.patch application/octet-stream 25.3 KB
0002-Index-skip-scan.patch application/octet-stream 213.6 KB
0003-Make-planner-favor-skip-in-index-scans-and-modify-te.patch application/octet-stream 168.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-03-21 22:13:03 Re: Ecpg dependency
Previous Message Tomas Vondra 2020-03-21 21:59:46 Re: Additional improvements to extended statistics