Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-08 04:22:54
Message-ID: CAH2-Wzk42aYVO4fOGzpt_O9wdC3=iw7Fwo8MXFuQi8fOWH290g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> I came up with the attached (against master), which addresses the 2nd
> and 3rd points. I added a whole new BTInsertStateData struct, to hold
> the binary search bounds. BTScanInsert now only holds the 'scankeys'
> array, and the 'nextkey' flag. The new BTInsertStateData struct also
> holds the current buffer we're considering to insert to, and a
> 'bounds_valid' flag to indicate if the saved bounds are valid for the
> current buffer. That way, it's more straightforward to clear the
> 'bounds_valid' flag whenever we move right.
>
> I made a copy of the _bt_binsrch, _bt_binsrch_insert. It does the binary
> search like _bt_binsrch does, but the bounds caching is only done in
> _bt_binsrch_insert. Seems more clear to have separate functions for them
> now, even though there's some duplication.

Attached is v15, which does not yet integrate these changes. However,
it does integrate earlier feedback that you posted for v14. I also
cleaned up some comments within nbtsplitloc.c.

I would like to work through these other items with you
(_bt_binsrch_insert() and so on), but I think that it would be helpful
if you made an effort to understand the minusinfkey stuff first. I
spent a lot of time improving the explanation of that within
_bt_compare(). It's important.

The !minusinfkey optimization is more than just a "nice to have".
Suffix truncation makes pivot tuples less restrictive about what can
go on each page, but that might actually hurt performance if we're not
also careful to descend directly to the leaf page where matches will
first appear (rather than descending to a page to its left). If we
needlessly descend to a page that's to the left of the leaf page we
really ought to go straight to, then there are cases that are
regressed rather than helped -- especially cases where splits use the
"many duplicates" strategy. You continue to get correct answers when
the !minusinfkey optimization is ripped out, but it seems almost
essential that we include it. While it's true that we've always had to
move left like this, it's also true that suffix truncation will make
it happen much more frequently. It would happen (without the
!minusinfkey optimization) most often where suffix truncation makes
pivot tuples smallest.

Once you grok the minusinfkey stuff, then we'll be in a better
position to work through the feedback about _bt_binsrch_insert() and
so on, I think. You may lack all of the context of how the second
patch goes on to use the new insertion scan key struct, so it will
probably make life easier if we're both on the same page. (Pun very
much intended.)

Thanks again!
--
Peter Geoghegan

Attachment Content-Type Size
v15-0001-Refactor-nbtree-insertion-scankeys.patch application/octet-stream 57.9 KB
v15-0003-Consider-secondary-factors-during-nbtree-splits.patch application/octet-stream 49.3 KB
v15-0005-Add-split-after-new-tuple-optimization.patch application/octet-stream 11.9 KB
v15-0004-Allow-tuples-to-be-relocated-from-root-by-amchec.patch application/octet-stream 16.9 KB
v15-0002-Make-heap-TID-a-tie-breaker-nbtree-index-column.patch application/octet-stream 156.2 KB
v15-0006-Add-high-key-continuescan-optimization.patch application/octet-stream 8.6 KB
v15-0007-DEBUG-Add-pageinspect-instrumentation.patch application/octet-stream 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ideriha, Takeshi 2019-03-08 04:40:18 RE: Protect syscache from bloating with negative cache entries
Previous Message Chapman Flack 2019-03-08 04:04:07 Re: House style for DocBook documentation?