Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-16 04:16:23
Message-ID: CAH2-Wznizred1To2x7uzkRD8ZYOpPbzYaoU5aG_g_UJCsiWAFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 14, 2019 at 2:21 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> It doesn't matter how often it happens, the code still needs to deal
> with it. So let's try to make it as readable as possible.

> Well, IMHO holding the buffer and the bounds in the new struct is more
> clean than the savebinsrc/restorebinsrch flags. That's exactly why I
> suggested it. I don't know what else to suggest. I haven't done any
> benchmarking, but I doubt there's any measurable difference.

Fair enough. Attached is v17, which does it using the approach taken
in your earlier prototype. I even came around to your view on
_bt_binsrch_insert() -- I kept that part, too. Note, however, that I
still pass checkingunique to _bt_findinsertloc(), because that's a
distinct condition to whether or not bounds were cached (they happen
to be the same thing right now, but I don't want to assume that).

This revision also integrates your approach to the "continuescan"
optimization patch, with the small tweak I mentioned yesterday (we
also pass ntupatts). I also prefer this approach.

I plan on committing the first few patches early next week, barring
any objections, or any performance problems noticed during an
additional, final round of performance validation. I won't expect
feedback from you until Monday at the earliest. It would be nice if
you could take a look at the amcheck "relocate" patch. My intention is
to push patches up to and including the amcheck "relocate" patch on
the same day (I'll leave a few hours between the first two patches, to
confirm that the first patch doesn't break the buildfarm).

BTW, my multi-day, large BenchmarkSQL benchmark continues, with some
interesting results. The first round of 12 hour long runs showed the
patch nearly 6% ahead in terms of transaction throughput, with a
database that's almost 1 terabyte. The second round, which completed
yesterday and reuses the database initialized for the first round
showed that the patch had 10.7% higher throughput. That's a new record
for the patch. I'm going to leave this benchmark running for a few
more days, at least until it stops being interesting. I wonder how
long it will be before the master branch throughput stops declining
relative to throughput with the patched version. I expect that the
master branch will reach "index bloat saturation point" sooner or
later. Indexes in the patch's data directory continue to get larger,
as expected, but the amount of bloat accumulated over time is barely
noticeable (i.e. the pages are packed tight with tuples, which barely
declines over time).

This version of the patch series has attributions/credits at the end
of the commit messages. I have listed you as a secondary author on a
couple of the patches, where code was lifted from your feedback
patches. Let me know if you think that I have it right.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v17-0001-Refactor-nbtree-insertion-scankeys.patch application/octet-stream 64.7 KB
v17-0005-Add-split-after-new-tuple-optimization.patch application/octet-stream 12.3 KB
v17-0002-Make-heap-TID-a-tiebreaker-nbtree-index-column.patch application/octet-stream 154.2 KB
v17-0003-Consider-secondary-factors-during-nbtree-splits.patch application/octet-stream 51.5 KB
v17-0004-Allow-tuples-to-be-relocated-from-root-by-amchec.patch application/octet-stream 15.6 KB
v17-0006-Add-high-key-continuescan-optimization.patch application/octet-stream 13.4 KB
v17-0007-DEBUG-Add-pageinspect-instrumentation.patch application/octet-stream 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2019-03-16 06:43:10 Re: Pluggable Storage - Andres's take
Previous Message Tom Lane 2019-03-16 01:50:08 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?