Re: Index Skip Scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Floris Van Nee <florisvannee(at)optiver(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "bhushan(dot)uparkar(at)gmail(dot)com" <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Index Skip Scan
Date: 2020-05-13 09:07:21
Message-ID: CAFiTN-u-=aMJLQJg=kdiKoQkKtiLBpOHH=0gtJ2FhqPZgx_6Rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 11, 2020 at 4:55 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>
> > On Mon, May 11, 2020 at 04:04:00PM +0530, Dilip Kumar wrote:
> >
> > > > +static inline bool
> > > > +_bt_scankey_within_page(IndexScanDesc scan, BTScanInsert key,
> > > > + Buffer buf, ScanDirection dir)
> > > > +{
> > > > + OffsetNumber low, high;
> > > > + Page page = BufferGetPage(buf);
> > > > + BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
> > > > +
> > > > + low = P_FIRSTDATAKEY(opaque);
> > > > + high = PageGetMaxOffsetNumber(page);
> > > > +
> > > > + if (unlikely(high < low))
> > > > + return false;
> > > > +
> > > > + return (_bt_compare(scan->indexRelation, key, page, low) > 0 &&
> > > > + _bt_compare(scan->indexRelation, key, page, high) < 1);
> > > > +}
> > > >
> > > > I think the high key condition should be changed to
> > > > _bt_compare(scan->indexRelation, key, page, high) < 0 ? Because if
> > > > prefix qual is equal to the high key then also there is no point in
> > > > searching on the current page so we can directly skip.
> > >
> > > From nbtree/README and comments to functions like _bt_split I've got an
> > > impression that the high key could be equal to the last item on the leaf
> > > page, so there is a point in searching. Is that incorrect?
> >
> > But IIUC, here we want to decide whether we will get the next key in
> > the current page or not?
>
> In general this function does what it says, it checks wether or not the
> provided scankey could be found within the page. All the logic about
> finding a proper next key to fetch is implemented on the call site, and
> within this function we want to test whatever was passed in. Does it
> answer the question?

Ok, I agree that the function is doing what it is expected to do.
But, then I have a problem with this call site.

+ /* Check if the next unique key can be found within the current page.
+ * Since we do not lock the current page between jumps, it's possible
+ * that it was splitted since the last time we saw it. This is fine in
+ * case of scanning forward, since page split to the right and we are
+ * still on the left most page. In case of scanning backwards it's
+ * possible to loose some pages and we need to remember the previous
+ * page, and then follow the right link from the current page until we
+ * find the original one.
+ *
+ * Since the whole idea of checking the current page is to protect
+ * ourselves and make more performant statistic mismatch case when
+ * there are too many distinct values for jumping, it's not clear if
+ * the complexity of this solution in case of backward scan is
+ * justified, so for now just avoid it.
+ */
+ if (BufferIsValid(so->currPos.buf) && ScanDirectionIsForward(dir))
+ {
+ LockBuffer(so->currPos.buf, BT_READ);
+
+ if (_bt_scankey_within_page(scan, so->skipScanKey, so->currPos.buf, dir))
+ {

Here we expect whether the "next" unique key can be found on this page
or not, but we are using the function which suggested whether the
"current" key can be found on this page or not. I think in boundary
cases where the high key is equal to the current key, this function
will return true (which is expected from this function), and based on
that we will simply scan the current page and IMHO that cost could be
avoided no?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-05-13 10:15:58 Re: Add "-Wimplicit-fallthrough" to default flags
Previous Message movead.li@highgo.ca 2020-05-13 08:29:19 Re: A patch for get origin from commit_ts.