|From:||Dmitry Dolgov <9erthalion6(at)gmail(dot)com>|
|To:||Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, a(dot)kuzmenkov(at)postgrespro(dot)ru|
|Cc:||pg(at)bowt(dot)ie, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, jtc331(at)gmail(dot)com|
|Subject:||Re: Index Skip Scan|
|Views:||Raw Message | Whole Thread | Download mbox|
> On Fri, 16 Nov 2018 at 16:06, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com> wrote:
> On 11/15/18 6:41 AM, Alexander Kuzmenkov wrote:
> >>> But having this logic inside _bt_next means that it will make a
> >>> non-skip index
> >>> only scan a bit slower, am I right?
> >> Correct.
> > Well, it depends on how the skip scan is implemented. We don't have to
> > make normal scans slower, because skip scan is just a separate thing.
> > My main point was that current implementation is good as a proof of
> > concept, but it is inefficient for some data and needs some unreliable
> > planner logic to work around this inefficiency. And now we also have
> > execution-time fallback because planning-time fallback isn't good
> > enough. This looks overly complicated. Let's try to design an algorithm
> > that works well regardless of the particular data and doesn't need these
> > heuristics. It should be possible to do so for btree.
> > Of course, I understand the reluctance to implement an entire new type
> > of btree traversal. Anastasia Lubennikova suggested a tweak for the
> > current method that may improve the performance for small groups of
> > equal values. When we search for the next unique key, first check if it
> > is contained on the current btree page using its 'high key'. If it is,
> > find it on the current page. If not, search from the root of the tree
> > like we do now. This should improve the performance for small equal
> > groups, because there are going to be several such groups on the page.
> > And this is exactly where we have the regression compared to unique +
> > index scan.
> Robert suggested something similar in . I'll try and look at that
> when I'm back from my holiday.
Yeah, probably you're right. Unfortunately, I've misunderstood the previous
Robert's message in this thread with the similar approach. Jesper, I hope you
don't mind if I'll post an updated patch? _bt_skip is changed there in the
suggested way, so that it checks the current page before searching from the
root of a tree, and I've removed the fallback logic. After some
initial tests I see
that with this version skip scan over a table with 10^7 rows and 10^6
distinct values is slightly slower than a regular scan, but not that much.
> > By the way, what is the data for which we intend this feature to work?
> > Obviously a non-unique btree index, but how wide are the tuples, and how
> > big the equal groups? It would be good to have some real-world examples.
> Although my primary use-case is int I agree that we should test the
> different data types, and tuple widths.
My personal motivation here is exactly that we face use-cases for skip scan
from time to time. Usually it's quite few distinct values (up to a dozen or
so, which means that equal groups are quite big), but with the variety of types
> On Thu, 15 Nov 2018 at 15:28, James Coleman <jtc331(at)gmail(dot)com> wrote:
> > Is skip scan only possible for index-only scan?
> I haven't see much discussion of this question yet. Is there a
> particular reason to lock ourselves into thinking about this only in
> an index only scan?
I guess, the only reason is to limit the scope of the patch.
|Next Message||Justin Pryzby||2018-11-17 23:37:15||Re: pg11.1 jit segv|
|Previous Message||Tomas Vondra||2018-11-17 23:11:54||Re: valgrind issues on Fedora 28|