Re: Index Skip Scan

From: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: a(dot)kuzmenkov(at)postgrespro(dot)ru, pg(at)bowt(dot)ie, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Index Skip Scan
Date: 2018-09-27 13:59:50
Message-ID: cc30955d-2ee9-4c36-e5d7-fc5b552fbbb6@redhat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Dmitry,

On 9/15/18 3:52 PM, Dmitry Dolgov wrote:
>> On Thu, 13 Sep 2018 at 21:36, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:
>> El 13/09/18 a las 18:39, Jesper Pedersen escribió:
>>> I think we can improve this,
>>> and the skip scan can be strictly faster than index scan regardless of
>>> the data. As a first approximation, imagine that we somehow skipped
>>> equal tuples inside _bt_next instead of sending them to the parent
>>> Unique node. This would already be marginally faster than Unique + Index
>>> scan. A more practical implementation would be to remember our position
>>> in tree (that is, BTStack returned by _bt_search) and use it to skip
>>> pages in bulk. This looks straightforward to implement for a tree that
>>> does not change, but I'm not sure how to make it work with concurrent
>>> modifications. Still, this looks a worthwhile direction to me, because
>>> if we have a strictly faster skip scan, we can just use it always and
>>> not worry about our unreliable statistics. What do you think?
>>>
>>
>> This is something to look at -- maybe there is a way to use
>> btpo_next/btpo_prev instead/too in order to speed things up. Atm we just
>> have the scan key in BTScanOpaqueData. I'll take a look after my
>> upcoming vacation; feel free to contribute those changes in the meantime
>> of course.
>
> But having this logic inside _bt_next means that it will make a non-skip index
> only scan a bit slower, am I right?

Correct.

> Probably it would be easier and more
> straightforward to go with the idea of dynamic fallback then. The first naive
> implementation that I came up with is to wrap an index scan node into a unique,
> and remember estimated number of groups into IndexOnlyScanState, so that we can
> check if we performed much more skips than expected. With this approach index
> skip scan will work a bit slower than in the original patch in case if
> ndistinct is correct (because a unique node will recheck rows we returned), and
> fallback to unique + index only scan in case if planner has underestimated
> ndistinct.
>

I think we need a comment on this in the patch, as 10 *
node->ioss_PlanRows looks a bit random.

Thanks for your contribution !

Best regards,
Jesper

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2018-09-27 14:03:51 Re: pread() and pwrite()
Previous Message Dean Rasheed 2018-09-27 13:22:39 Re: BUG #15307: Low numerical precision of (Co-) Variance