Re: Index Skip Scan

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, James Coleman <jtc331(at)gmail(dot)com>
Subject: Re: Index Skip Scan
Date: 2019-06-03 20:31:33
Message-ID: CA+q6zcVnRRqktw4cg2Vddy3S+VY3Tsmit0kausYtctvsTum7tA@mail.gmail.com
Lists: pgsql-hackers

> On Sat, Jun 1, 2019 at 6:57 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>
> > On Sat, Jun 1, 2019 at 5:34 PM Floris Van Nee <florisvannee(at)optiver(dot)com> wrote:
> >
> > I did a little bit of investigation and it seems to occur because in
> > pathkeys.c the function pathkey_is_redundant considers pathkeys redundant if
> > there is an equality condition with a constant in the corresponding WHERE
> > clause.
> > ...
> > However, the index skip scan interprets this as that it has to skip over just
> > the first column.
>
> Right, passing correct number of columns fixes this particular problem. But
> while debugging I've also discovered another related issue, when the current
> implementation seems to have a few assumptions, that are not correct if we have
> an index condition and a distinct column is not the first in the index. I'll
> try to address these in a next version of the patch in the nearest future.

So, as mentioned above, there were a few problems: the number of
distinct_pathkeys differs with and without redundancy elimination, and
_bt_search is used even when the order of the distinct columns doesn't match
the index. As far as I can see, the problem in the latter case (when we have an
index condition) is that it's still possible to find a value, but the lastItem
value after the search is always zero (due to _bt_checkkeys filtering), so
_bt_next stops right away.

To address this, we can probably do something like the attached patch. Along
with distinct_pathkeys it stores uniq_distinct_pathkeys, which is the same list
but without the elimination of constants. It is then used to get the real
number of distinct keys, and to check the order of the columns so that index
skip scan is not considered if that order differs from the index. I hope it
doesn't look too hacky.
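
To illustrate the idea outside of the planner, here is a toy model in Python.
It is only a sketch and not the code from the patch: the function names
(eliminate_constants, skip_prefix_length) are made up for illustration, columns
are plain strings instead of pathkeys, and the rule "distinct columns must be a
leading prefix of the index, in the same order" is my assumption about what
"check the order of the columns" amounts to.

```python
from typing import List, Optional, Sequence, Set


def eliminate_constants(distinct_cols: Sequence[str],
                        const_cols: Set[str]) -> List[str]:
    """Toy stand-in for redundancy elimination (hypothetical helper).

    Columns pinned by a `col = constant` condition are dropped, which is
    roughly what happens to distinct_pathkeys.
    """
    return [c for c in distinct_cols if c not in const_cols]


def skip_prefix_length(index_cols: Sequence[str],
                       uniq_distinct_cols: Sequence[str]) -> Optional[int]:
    """Return how many leading index columns a skip scan would skip over.

    Returns None when the distinct columns are not a leading prefix of the
    index in the same order, i.e. skip scan should not be considered.
    (Assumed rule for this sketch, not taken from the patch.)
    """
    n = len(uniq_distinct_cols)
    if n == 0 or n > len(index_cols):
        return None
    if list(index_cols[:n]) != list(uniq_distinct_cols):
        return None
    return n


# Index on (a, b); query: select distinct a, b from t where a = 1
index_cols = ["a", "b"]
uniq_distinct = ["a", "b"]                        # constants kept
distinct = eliminate_constants(uniq_distinct, {"a"})  # constants removed

print(len(distinct))                              # count after elimination
print(skip_prefix_length(index_cols, uniq_distinct))  # real prefix length
print(skip_prefix_length(index_cols, ["b"]))      # order mismatch
```

In this model the reduced list has only one entry left, which is what led to
skipping over just the first column, while the list without constants
elimination gives the real prefix length of two. A distinct column that is not
the first one in the index yields None, so skip scan would not be considered.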

Also, I've noticed that the current implementation wouldn't work for, e.g.:

select distinct a, a from table;

because in this case the IndexPath is hidden behind a ProjectionPath. For now I
guess that's fine, but it's probably possible to apply skip scan here too.

Attachment Content-Type Size
v17-0001-Index-skip-scan.patch application/octet-stream 50.8 KB
