Quick Links

Re: Index Skip Scan

From:	Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, a(dot)kuzmenkov(at)postgrespro(dot)ru, pg(at)bowt(dot)ie, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject:	Re: Index Skip Scan
Date:	2018-10-16 19:22:18
Message-ID:	CA+q6zcV5XmtTFRVKdkK6nhd4Xf4D-SzzLHjvkkXFdnQJBV-pNw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> On Fri, 12 Oct 2018 at 19:44, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Sep 13, 2018 at 11:40 AM Jesper Pedersen
> <jesper(dot)pedersen(at)redhat(dot)com> wrote:
> > > I noticed that current implementation doesn't
> > > perform well when we have lots of small groups of equal values. Here is
> > > the execution time of index skip scan vs unique over index scan, in ms,
> > > depending on the size of group. The benchmark script is attached.
> > >
> > > group size skip unique
> > > 1 2,293.85 132.55
> > > 5 464.40 106.59
> > > 10 239.61 102.02
> > > 50 56.59 98.74
> > > 100 32.56 103.04
> > > 500 6.08 97.09
> > >
> >
> > Yes, this doesn't look good. Using your test case I'm seeing that unique
> > is being chosen when the group size is below 34, and skip above.
>
> I'm not sure exactly how the current patch is approaching the problem,
> but it seems like it might be reasonable to do something like -- look
> for a distinct item within the current page; if not found, then search
> from the root of the tree for the next item > the current item.

I'm not sure that I understand it correctly, can you elaborate please? From
what I see it's quite similar to what's already implemented - we look for a
distinct item on the page, and then search the index tree for a next distinct
item.

> On Wed, 10 Oct 2018 at 17:34, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> > On Tue, 9 Oct 2018 at 18:13, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> >
> >> It looks like good idea, but then the node should be named "index scan" and
> >> other info can be displayed in detail parts. It can be similar like "sort".
> >> The combination of unique and index skip scan looks strange :)
> >
> > maybe we don't need special index skip scan node - maybe possibility to
> > return unique values from index scan node can be good enough - some like
> > "distinct index scan" - and the implementation there can be dynamic -skip
> > scan, classic index scan,
> >
> > "index skip scan" is not good name if the implementaion is dynamic.
>
> Yeah, that's a valid point. The good part is that index skip scan is not really
> a separate node, but just enhanced index only scan node. So indeed maybe it
> would be better to call it Index Only Scan, but show in details that we apply
> the skip scan strategy. Any other opinions about this?

To make it more clean what I mean, see attached version of the patch.

> >> I think it was query like
> >> select count(*) from (select distinct x from tab) s
>
> Thanks, I'll take a look.

I couldn't reproduce it either yet.

Attachment	Content-Type	Size
index-skip-fallback-v2.patch	application/octet-stream	31.7 KB

In response to

Re: Index Skip Scan at 2018-10-12 17:43:48 from Robert Haas

Responses

Re: Index Skip Scan at 2018-11-12 12:28:58 from Sergei Kornilov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2018-10-16 19:27:33	Re: Large writable variables
Previous Message	Robert Haas	2018-10-16 19:13:43	Re: Large writable variables