Re: Index Skip Scan (new UniqueKeys)

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Index Skip Scan (new UniqueKeys)
Date: 2020-10-06 15:20:39
Message-ID: 20201006152039.xs2ul7il52sowhlu@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mon, Sep 21, 2020 at 05:59:32PM -0700, Peter Geoghegan wrote:
>
> * I see the following compiler warning:
>
> /code/postgresql/patch/build/../source/src/backend/optimizer/path/uniquekeys.c:
> In function ‘populate_baserel_uniquekeys’:
> /code/postgresql/patch/build/../source/src/backend/optimizer/path/uniquekeys.c:797:13:
> warning: ‘expr’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
> 797 | else if (!list_member(unique_index->rel->reltarget->exprs, expr))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is mostly for UniqueKeys patch, which is attached here only as a
dependency, but I'll prepare changes for that. Interesting enough I
can't reproduce this warning, but if I understand correctly gcc has some
history of spurious uninitialized warnings, so I guess it could be
version dependent.

> * Perhaps the warning is related to this nearby code that I noticed
> Valgrind complains about:
>
> ==1083468== VALGRINDERROR-BEGIN
> ==1083468== Invalid read of size 4
> ==1083468== at 0x59568A: get_exprs_from_uniqueindex (uniquekeys.c:771)
> ==1083468== by 0x593C5B: populate_baserel_uniquekeys (uniquekeys.c:140)

This also belongs to UniqueKeys patch, but at least I can reproduce this
one. My guess is that nkeycolums should be used there, not ncolums,
which is visible in index_incuding tests. The same as previous one, will
prepare corresponding changes.

> * Do we really need the AM-level boolean flag/argument named
> "scanstart"? Why not just follow the example of btgettuple(), which
> determines whether or not the scan has been initialized based on the
> current scan position?
>
> Just because you set so->currPos.buf to InvalidBuffer doesn't mean you
> cannot or should not take the same approach as btgettuple(). And even
> if you can't take exactly the same approach, I would still think that
> the scan's opaque B-Tree state should remember if it's the first call
> to _bt_skip() (rather than some subsequent call) in some other way
> (e.g. carrying a "scanstart" bool flag directly).

Yes, agree, carrying this flag inside the opaque state would be better.

> * Why is it okay to do anything important based on the
> _bt_scankey_within_page() return value?
>
> If the page is empty, then how can we know that it's okay to go to the
> next value? I'm concerned that there could be subtle bugs in this
> area. VACUUM will usually just delete the empty page. But it won't
> always do so, for a variety of reasons that aren't worth going into
> now. This could mask bugs in this area. I'm concerned about patterns
> like this one from _bt_skip():
>
> while (!nextFound)
> {
> ....
>
> if (_bt_scankey_within_page(scan, so->skipScanKey,
> so->currPos.buf, dir))
> {
> ...
> }
> else
> /*
> * If startItup could be not found within the current page,
> * assume we found something new
> */
> nextFound = true;
> ....
> }
>
> Why would you assume that "we found something new" here? In general I
> just don't understand the design of _bt_skip(). I get the basic idea
> of what you're trying to do, but it could really use better comments.

Yeah, I'll put more efforts into clear comments. There are two different
ways in which _bt_scankey_within_page is being used.

The first one is to check if it's possible to skip traversal of the tree
from root in case if what we're looking for could be on the current
page. In this case an empty page would mean we need to search from the
root, so not sure what could be the issue here?

The second one (that you've highlighted above) I admit is probably the
most questionable part of the patch and open for suggestions how to
improve it. It's required for one particular case with a cursor when
scan advances forward but reads backward. What could happen here is we
found one valid item, but the next one e.g. do not pass scan key
conditions, and we end up with the previous item again. I'm not entirely
sure how presence of an empty page could change this scenario, could you
please show an example?

> *The "jump one more time if it's the same as at the beginning" thing
> seems scary to me. Maybe you should be doing something with the actual
> high key here.

Same as for the previous question, can you give a hint what do you mean
by "doing something with the actual high key"?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-10-06 15:22:11 Re: Prevent printing "next step instructions" in initdb and pg_upgrade
Previous Message Tom Lane 2020-10-06 15:06:13 Re: Prevent printing "next step instructions" in initdb and pg_upgrade