Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-12 21:21:58
Message-ID: 20190312212158.jlofrqudkyw3ikbk@alap3.anarazel.de
Lists: pgsql-hackers

On 2019-03-12 14:15:06 -0700, Peter Geoghegan wrote:
> On Tue, Mar 12, 2019 at 12:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Have you looked at an offwake or lwlock wait graph (bcc tools) or
> > something in that vein? Would be interesting to see what is waiting for
> > what most often...
>
> Not recently, though I did use your BCC script for this very purpose
> quite a few months ago. I don't remember it helping that much at the
> time, but then that was with a version of the patch that lacked a
> couple of important optimizations that we have now. We're now very
> careful to not descend to the left with an equal pivot tuple. We
> descend right instead when that's definitely the only place we'll find
> matches (a high key doesn't count as a match in almost all cases!).
> Edge-cases where we unnecessarily move left then right, or
> unnecessarily move right a second time once on the leaf level have
> been fixed. I fixed the regression I was worried about at the time,
> without getting much benefit from the BCC script, and moved on.
>
> This kind of minutiae is more important than it sounds. I have used
> EXPLAIN(ANALYZE, BUFFERS) instrumentation to make sure that I
> understand where every single block access comes from with these
> edge-cases, paying close attention to the structure of the index, and
> how the key space is broken up (the values of pivot tuples in internal
> pages). It is one thing to make the index smaller, and another thing
> to take full advantage of that -- I have both. This is one of the
> reasons why I believe that this minor regression cannot be avoided,
> short of simply allowing the index to get bloated: I'm simply not
> doing things that differently outside of the page split code, and what
> I am doing differently is clearly superior. Both in general, and for
> the NEW_ORDER transaction in particular.
>
> I'll make that another TODO item -- this regression will be revisited
> using BCC instrumentation. I am currently performing a multi-day
> benchmark on a very large TPC-C/BenchmarkSQL database, and it will
> have to wait for that. (I would like to use the same environment as
> before.)

I'm basically just curious which buffers account for most of the
additional contention. Is it the smaller number of leaf pages, the inner
pages, or (somewhat inexplicably) the meta page, or ...? I was thinking
that the callstacks that e.g. my lwlock tool reports should be able to
show which callstack most of the waits are occurring in.

(I should work a bit on that script. I locally had a version that showed
both the waiting and the waking-up callstack, but I can't find it anymore.)

Greetings,

Andres Freund
