Generic Index Skip Scan

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Thomas Munro" <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Generic Index Skip Scan
Date: 2020-07-15 19:52:02
Message-ID: e4b623692a1447d4a13ac80fa271c8e6@opammb0561.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Besides the great efforts that Dmitry et al. are putting into the skip scan for DISTINCT queries [1], I'm also still keen on extending the use of it further. I'd like to address the limited cases in which skipping can occur here. A few months ago I shared an initial rough patch that provided a generic skip implementation, but lacked the proper planning work [2]. I'd like to share a second patch set that provides an implementation of the planner as well. Perhaps this can lead to some proper discussions how we'd like to shape this patch further.

Please see [2] for an introduction and some rough performance comparisons. This patch improves upon those, because it implements proper cost estimation logic. It will now only choose the skip scan if it's deemed to be cheaper than using a regular index scan. Other than that, all the features are still there. The skip scan can be used in many more types of queries than in the original DISTINCT patch as provided in [1], making it more performant and also more predictable for users.

I'm keen on receiving feedback on this idea and on the patch. I believe it could be a great feature that is useful to many users. However, when I posted the previous version of the patch, only Thomas expressed his explicit interest in the feature. It would be useful for me to know if there's enough interest here. Please speak out as well if you can't (currently) review, but do think that this feature is worth the effort.

I'm sure there are still plenty of things that need to be improved. I have some in mind, but at the moment it's hard for me to judge which ones are really important and which ones are not. I think I really need someone with more experience of the code looking at this for feedback.

v9-0001 + v9-0002 are Andy's UniqueKeys patches [3]
v01-0001 is a slightly modified version of Dmitry's extension of unique keys patch (his lastest patch plus the diff patch that I posted in the original index skip thread)
v01-0002 is the bulk of the work: the skip implementation, indexam interface and implementation for DISTINCT queries
v01-0003 is the additional planner work to add support for skipping in regular index scans (non-DISTINCT)

-Floris

[1] https://www.postgresql.org/message-id/flat/20200609102247(dot)jdlatmfyeecg52fi(at)localhost
[2] https://www.postgresql.org/message-id/c5c5c974714a47f1b226c857699e8680%40opammb0561.comp.optiver.com
[3] https://www.postgresql.org/message-id/flat/CAKU4AWrwZMAL=uaFUDMf4WGOVkEL3ONbatqju9nSXTUucpp_pw(at)mail(dot)gmail(dot)com

Attachment Content-Type Size
v9-0001-Introduce-RelOptInfo-notnullattrs-attribute.patch application/octet-stream 4.8 KB
v9-0002-Introduce-UniqueKey-attributes-on-RelOptInfo-stru.patch application/octet-stream 58.6 KB
v01-0001-Extend-UniqueKeys.patch application/octet-stream 13.0 KB
v01-0002-Index-skip-scan.patch application/octet-stream 218.2 KB
v01-0003-Support-skip-scan-for-non-distinct-scans.patch application/octet-stream 18.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-07-15 19:52:03 Dependencies for partitioned indexes are still a mess
Previous Message Tom Lane 2020-07-15 19:29:26 Re: Dumping/restoring fails on inherited generated column