Re: Index Skip Scan

From: James Coleman <jtc331(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, jesper(dot)pedersen(at)redhat(dot)com, a(dot)kuzmenkov(at)postgrespro(dot)ru, Peter Geoghegan <pg(at)bowt(dot)ie>, thomas(dot)munro(at)enterprisedb(dot)com, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Index Skip Scan
Date: 2019-02-01 21:04:58
Message-ID: CAAaqYe_yfALkVzRggdWwxLLB8C=BFSLGCcJZh=ZvYCc8+r2CqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 31, 2019 at 1:32 AM Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Also as mentioned upthread by Peter Geoghegan, this could easly
> give worse plan by underestimation. So I also suggest that this
> has dynamic fallback function. In such perspective it is not
> suitable for AM API level feature.
>
> If all leaf pages are on the buffer and the average hopping
> distance is less than (expectedly) 4 pages (the average height of
> the tree), the skip scan will lose. If almost all leaf pages are
> staying on disk, we could win only by 2-pages step (skipping over
> one page).
>
> =====
> As I'm writing the above, I came to think that it's better
> implement this as an pure executor optimization.
>
> Specifically, let _bt_steppage count the ratio of skipped pages
> so far then if the ratio exceeds some threshold (maybe around
> 3/4) it gets into hopscotching mode, where it uses index scan to
> find the next page (rather than traversing). As mentioned above,
> I think using skip scan to go beyond the next page is a good
> bet. If the success ration of skip scans goes below some
> threshold (I'm not sure for now), we should fall back to
> traversing.
>
> Any opinions?

Hi!

I'd like to offer a counterpoint: in cases where this a huge win we
definitely do want this to affect cost estimation, because if it's
purely an executor optimization the index scan path may not be chosen
even when skip scanning would be a dramatic improvement.

I suppose that both requirements could be met by incorporating it into
the existing index scanning code and also modifying to costing to
(only when we have high confidence?) account for the optimization. I'm
not sure if that makes things better than the current state of the
patch or not.

James Coleman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-02-01 22:05:03 Re: Index Skip Scan
Previous Message Robert Haas 2019-02-01 20:31:06 Re: ATTACH/DETACH PARTITION CONCURRENTLY