Re: Index Skip Scan

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc:	Floris Van Nee <florisvannee(at)optiver(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject:	Re: Index Skip Scan
Date:	2020-01-21 01:05:33
Message-ID:	CAH2-WznfCX_XQWWVh+FWbB77aNWTa3MnBzGYffMU0TYYsVBOjg@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Jan 20, 2020 at 1:19 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Mon, Jan 20, 2020 at 11:01 AM Jesper Pedersen
> <jesper(dot)pedersen(at)redhat(dot)com> wrote:
> > > - nbtsearch.c _bt_skip line 1440
> > > if (BTScanPosIsValid(so->currPos) &&
> > > _bt_scankey_within_page(scan, so->skipScanKey, so->currPos.buf, dir))
> > >
> > > Is it allowed to look at the high key / low key of the page without having a read lock on it?
> > >
> >
> > In case of a split the page will still contain a high key and a low key,
> > so this should be ok.
>
> This is definitely not okay.

I suggest that you find a way to add assertions to code like
_bt_readpage() that verify that we do in fact have the buffer content
lock. Actually, there is an existing assertion here that covers the
pin, but not the buffer content lock:

static bool
_bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum)
{
    <declare variables>
    ...

    /*
     * We must have the buffer pinned and locked, but the usual macro can't be
     * used here; this function is what makes it good for currPos.
     */
    Assert(BufferIsValid(so->currPos.buf));

You can add another assertion that calls a new utility function in
bufmgr.c. That can use the same logic as this existing assertion in
FlushOneBuffer():

Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));
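To make the shape of that helper concrete, here is a rough, self-contained
toy model. Nothing in it is real PostgreSQL code: the Toy* types, the
toy_* functions, and the name ToyBufferContentLockHeldByMe() are all
hypothetical stand-ins so that the sketch compiles on its own. A real
version would presumably live in bufmgr.c, look up the buffer descriptor,
and return the result of
LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for PostgreSQL types; NOT the real definitions. */
typedef int Buffer;				/* 1-based buffer id, 0 means invalid */
#define InvalidBuffer 0
#define TOY_NBUFFERS 4

typedef struct ToyLWLock
{
	bool		held_by_me;		/* models what LWLockHeldByMe() would report */
} ToyLWLock;

typedef struct ToyBufferDesc
{
	int			pin_count;
	ToyLWLock	content_lock;
} ToyBufferDesc;

/* Static storage is zero-initialized: no pins, no locks held. */
static ToyBufferDesc toy_buffers[TOY_NBUFFERS];

static bool
ToyBufferIsValid(Buffer buf)
{
	return buf > 0 && buf <= TOY_NBUFFERS;
}

static void
toy_pin(Buffer buf)
{
	toy_buffers[buf - 1].pin_count++;
}

static void
toy_lock(Buffer buf)
{
	toy_buffers[buf - 1].content_lock.held_by_me = true;
}

static void
toy_unlock(Buffer buf)
{
	toy_buffers[buf - 1].content_lock.held_by_me = false;
}

/*
 * Hypothetical utility function: reports whether this backend holds the
 * content lock on the given buffer.  Returns false for an invalid buffer.
 */
static bool
ToyBufferContentLockHeldByMe(Buffer buf)
{
	if (!ToyBufferIsValid(buf))
		return false;
	return toy_buffers[buf - 1].content_lock.held_by_me;
}

/*
 * Stand-in for a _bt_readpage()-style function: check the buffer is valid
 * (as the existing assertion does), then also check the content lock.
 */
static bool
toy_readpage(Buffer buf)
{
	assert(ToyBufferIsValid(buf));
	assert(ToyBufferContentLockHeldByMe(buf));
	return true;
}
```

In this model, calling toy_readpage() on a buffer that is pinned but not
locked trips the second assertion, which is the kind of guard rail meant
here.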

We haven't needed assertions like this so far because it's usually
clear whether or not a buffer lock is held (plus the bufmgr.c
assertions help on their own). The fact that it isn't clear whether or
not a buffer lock will be held by the caller here suggests a problem.
Even so, having some guard rails in the form of these assertions could
be helpful. Also, it seems like _bt_scankey_within_page() should have
a similar set of assertions.

BTW, there is a paper that describes optimizations like loose index
scan and skip scan together, in fairly general terms: "Efficient
Search of Multidimensional B-Trees". Loose index scans are given the
name "MDAM duplicate elimination" in the paper. See:

http://vldb.org/conf/1995/P710.PDF

Goetz Graefe told me about the paper. It seems like the closest thing
that exists to a taxonomy or conceptual framework for these
techniques.

--
Peter Geoghegan
