Re: Index Skip Scan

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Floris Van Nee <florisvannee(at)optiver(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, James Coleman <jtc331(at)gmail(dot)com>
Subject: Re: Index Skip Scan
Date: 2019-06-24 11:44:14
Message-ID: CA+q6zcWDqDv7rXfm2d9Ah8z51qQgLK0-q_eLg8vLsdvb4RFf0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Sun, Jun 23, 2019 at 1:04 PM Floris Van Nee <florisvannee(at)optiver(dot)com> wrote:
>
> However, _bt_readpage messes things up, because it only reads tuples that
> match all the provided keys (so where b=2)

Right, the problem you've reported first had a similar origins. I'm starting to
think that probably using _bt_readpage just like that is not exactly right
thing to do, since the correct element is already found and there is no need to
check if tuples are matching after one step back. I'll try to avoid it in the
next version of patch.

> I was wondering about something else: don't we also have another problem with
> updating this current index tuple by skipping before calling
> btgettuple/_bt_next? I see there's some code in btgettuple to kill dead tuples
> when scan->kill_prior_tuple is true. I'm not too familiar with the concept of
> killing dead tuples while doing index scans, but by looking at the code it
> seems to be possible that btgettuple returns a tuple, caller processes it and
> sets kill_prior_tuple to true in order to have it killed. However, then the
> skip scan kicks in, which sets the current tuple to a completely different
> tuple. Then, on the next call of btgettuple, the wrong tuple gets killed. Is my
> reasoning correct here or am I missing something?

Need to check, but probably we can avoid that by setting kill_prior_tuple to
false in case of skip scan as in index_rescan.

> On Sun, Jun 23, 2019 at 3:10 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> I've done some initial review on v20 - just reading through the code, no
> tests at this point. Here are my comments:

Thank you!

> 2) indices.sgml
>
> The new section is somewhat unclear and difficult to understand, I think
> it'd deserve a rewording. Also, I wonder if we really should link to the
> wiki page about FSM problems. We have a couple of wiki links in the sgml
> docs, but those seem more generic while this seems as a development page
> that might disapper. But more importantly, that wiki page does not say
> anything about "Loose Index scans" so is it even the right wiki page?

Wow, indeed, looks like it's a totally wrong reference. I think Kyotaro already
mentioned it too, so probably I'm going to remove it (and instead describe the
idea in a few words in the documentation itself).

> 6) nodeIndexScan.c
>
> I wonder why we even add and initialize the ioss_ fields for IndexScan
> nodes, when the skip scans require index-only scans?

Skip scans required index-only scans until recently, when the patch was updated
to incorporate the same approach for index scans too. My apologies, looks like
documentation and some commentaries are still inconsistent about this topic.

> 7) pathnode.c
>
> I wonder how much was the costing discussed. It seems to me the logic is
> fairly similar to ideas discussed in the incremental sort patch, and
> we've been discussing some weak points there. I'm not sure how much we
> need to consider those issues here.

Can you please elaborate in a few words, which issues do you mean? Is it about
non uniform distribution of distinct values? If so, I believe it's partially
addressed when we have to skip too often, by searching a next index page.
Although yeah, there is still an assumption about uniform distribution of
distinct groups at the planning time.

> 9) pathnodes.h
>
> I don't understand what the uniq_distinct_pathkeys comment says :-(

Yeah, sorry, I'll try to improve the commentaries in the next version, where
I'm going to address all the feedback.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-06-24 12:52:00 Re: fix psql \conninfo & \connect when using hostaddr
Previous Message Andrey Borodin 2019-06-24 11:31:07 Re: GiST "choose subtree" support function to inline penalty