RE: Index Skip Scan (new UniqueKeys)

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Dilip Kumar" <dilipbalaut(at)gmail(dot)com>
Subject: RE: Index Skip Scan (new UniqueKeys)
Date: 2020-07-12 12:48:47
Message-ID: ef9954d83a9e42fabfac235bdd87d05a@opammb0561.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Good point, thanks for looking at this. With the latest planner version there
> are indeed more possibilities to use skipping. It never occured to me that
> some of those paths will still rely on index scan returning full data set. I'll look
> in details and add verification to prevent putting something like this on top of
> skip scan in the next version.

I believe the required changes are something like in attached patch. There were a few things I've changed:
- build_uniquekeys was constructing the list incorrectly. For a DISTINCT a,b, it would create two unique keys, one with a and one with b. However, it should be one unique key with (a,b).
- the uniquekeys that is built, still contains some redundant keys, that are normally eliminated from the path keys lists.
- the distinct_pathkeys may be NULL, even though there's a possibility for skipping. But it wouldn't create the uniquekeys in this case. This makes the planner not choose skip scans even though it could. For example in queries that do SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b; Since a is constant, it's eliminated from regular pathkeys.
- to combat the issues mentioned earlier, there's now a check in build_index_paths that checks if the query_pathkeys matches the useful_pathkeys. Note that we have to use the path keys here rather than any of the unique keys. The unique keys are only Expr nodes - they do not contain the necessary information about ordering. Due to elimination of some constant path keys, we have to search the attributes of the index to find the correct prefix to use in skipping.
- creating the skip scan path did not actually fill the Path's unique keys. It should just set this to query_uniquekeys.

I've attached the first two unique-keys patches (v9, 0001, 0002)), your patches, but rebased on v9 of unique keys (0003-0006) + a diff patch (0007) that applies my suggested changes on top of it.

-Floris

Attachment Content-Type Size
0001-Introduce-RelOptInfo-notnullattrs-attribute.patch application/octet-stream 4.8 KB
0002-Introduce-UniqueKey-attributes-on-RelOptInfo-struct.patch application/octet-stream 58.6 KB
0003-Extend-UniqueKeys.patch application/octet-stream 13.0 KB
0004-Index-skip-scan.patch application/octet-stream 38.4 KB
0005-Btree-implementation-of-skipping.patch application/octet-stream 40.0 KB
0006-Index-skip-scan-documentation.patch application/octet-stream 4.6 KB
0007-planner-fixes.patch application/octet-stream 11.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2020-07-12 13:13:31 Re: [PATCH] Performance Improvement For Copy From Binary Files
Previous Message Michael Paquier 2020-07-12 12:42:26 Re: A patch for get origin from commit_ts.