Re: Index Skip Scan

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Index Skip Scan
Date: 2019-07-28 19:17:14
Message-ID: CA+q6zcWC_TLO8fj_ad4pADLpr7UFjzRfXcMJ8gQsWtvT1Ky+9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, Jul 25, 2019 at 1:21 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> I have some comments.

Thank you for the review!

> + * The order of columns in the index should be the same, as for
> + * unique distincs pathkeys, otherwise we cannot use _bt_search
> + * in the skip implementation - this can lead to a missing
> + * records.
>
> It seems that it is enough that distinct pathkeys is contained in
> index pathkeys. If it's right, that is almost checked in existing
> code:

Looks like you're right. After looking closely there seems to be an issue in
the original implementation, when we use wrong prefix_size in such cases.
Without this problem this condition is indeed enough.

> > if (enable_indexskipscan &&
> > IsA(path, IndexPath) &&
> > ((IndexPath *) path)->indexinfo->amcanskip &&
> > (path->pathtype == T_IndexOnlyScan ||
> > path->pathtype == T_IndexScan) &&
> > (needed_pathkeys == root->distinct_pathkeys ||
> > pathkeys_contained_in(root->distinct_pathkeys,
> > path->pathkeys)))
>
> path->pathtype is always one of T_IndexOnlyScan or T_IndexScan so
> the check against them isn't needed. If you have concern on that,
> we can add that as Assert().
>
> + if (path->pathtype == T_IndexScan &&
> + parse->jointree != NULL &&
> + parse->jointree->quals != NULL &&
> + ((List *)parse->jointree->quals)->length != 0)
>
> It's better to use list_length instead of peeping inside. It
> handles the NULL case as well. (The structure has recently
> changed but .length is not, though.)

Yeah, will change both (hopefully soon)

> + /*
> + * XXX: In case of index scan quals evaluation happens after
> + * ExecScanFetch, which means skip results could be fitered out
> + */
>
> Why can't we use skipscan path if having filter condition? If
> something bad happens, the reason must be written here instead of
> what we do.

Sorry, looks like I've failed to express this more clear in the commentary. The
point is that when an index scan (not for index only scan) has some conditions,
their evaluation happens after skipping, and I don't see any not too much
invasive way to apply skip correctly.

>
> + * If advancing direction is different from index direction, we must
> + * skip right away, but _bt_skip requires a starting point.
>
> It doesn't seem needed to me. Could you elaborate on the reason
> for that?

This is needed for e.g. scan with a cursor backward without an index condition.
E.g. if we have:

1 1 2 2 3 3
1 2 3 4 5 6

and do

DECLARE c SCROLL CURSOR FOR
SELECT DISTINCT ON (a) a,b FROM ab ORDER BY a, b;

FETCH ALL FROM c;

we should get

1 2 3
1 3 5

When afterwards we do

FETCH BACKWARD ALL FROM c;

we should get

3 2 1
5 2 1

If we will use _bt_next first time without _bt_skip, the first pair would be
3 6 (the first from the end of the tuples, not from the end of the cursor).

> + * If advancing direction is different from index direction, we must
> + * skip right away, but _bt_skip requires a starting point.
> + */
> + if (direction * indexonlyscan->indexorderdir < 0 &&
> + !node->ioss_FirstTupleEmitted)
>
> I'm confused by this. "direction" there is the physical scan
> direction (fwd/bwd) of index scan, which is already compensated
> by indexorderdir. Thus the condition means we do that when
> logical ordering (ASC/DESC) is DESC. (Though I'm not sure what
> "index direction" exactly means...)

I'm not sure I follow, what do you mean by compensated? In general you're
right, as David Rowley mentioned above, indexorderdir is a general scan
direction, and direction is flipped estate->es_direction, which is a cursor
direction. The goal of this condition is catch when those two are different,
and we need to advance and read in different directions.

In response to

Responses

  • at 2019-07-29 08:31:20 from Kyotaro Horiguchi
  • Re: Index Skip Scan at 2019-08-02 12:14:06 from Dmitry Dolgov

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-07-28 19:21:51 Re: ANALYZE: ERROR: tuple already updated by self
Previous Message Tom Lane 2019-07-28 18:31:28 Re: how to run encoding-dependent tests by default