Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-12 21:15:06
Message-ID: CAH2-Wzk601QUG25+RqPgxYDrgVPataWrTenHgTjrUR+ECAugZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 12, 2019 at 12:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Have you looked at an offwake or lwlock wait graph (bcc tools) or
> something in that vein? Would be interesting to see what is waiting for
> what most often...

Not recently, though I did use your BCC script for this very purpose
quite a few months ago. I don't remember it helping that much at the
time, but then that was with a version of the patch that lacked a
couple of important optimizations that we have now. We're now very
careful to not descend to the left with an equal pivot tuple. We
descend right instead when that's definitely the only place we'll find
matches (a high key doesn't count as a match in almost all cases!).
Edge-cases where we unnecessarily move left then right, or
unnecessarily move right a second time once on the leaf level have
been fixed. I fixed the regression I was worried about at the time,
without getting much benefit from the BCC script, and moved on.

This kind of minutiae is more important than it sounds. I have used
EXPLAIN(ANALYZE, BUFFERS) instrumentation to make sure that I
understand where every single block access comes from with these
edge-cases, paying close attention to the structure of the index, and
how the key space is broken up (the values of pivot tuples in internal
pages). It is one thing to make the index smaller, and another thing
to take full advantage of that -- I have both. This is one of the
reasons why I believe that this minor regression cannot be avoided,
short of simply allowing the index to get bloated: I'm simply not
doing things that differently outside of the page split code, and what
I am doing differently is clearly superior. Both in general, and for
the NEW_ORDER transaction in particular.

I'll make that another TODO item -- this regression will be revisited
using BCC instrumentation. I am currently performing a multi-day
benchmark on a very large TPC-C/BenchmarkSQL database, and it will
have to wait for that. (I would like to use the same environment as
before.)

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-03-12 21:21:58 Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Previous Message Fabien COELHO 2019-03-12 21:08:19 Re: Offline enabling/disabling of data checksums