Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Floris Van Nee <florisvannee(at)optiver(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
Date: 2020-05-28 19:35:17
Message-ID: 1583D0EB-FCDB-4683-96BF-8D67974F9156@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mar 2, 2020, at 5:29 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Mar 1, 2020 at 12:15 PM Floris Van Nee <florisvannee(at)optiver(dot)com> wrote:
>> Minor conflict with that patch indeed. Attached is rebased version. I'm running some tests now - will post the results here when finished.
>
> Thanks.
>
> We're going to have to go back to my original approach to inlining. At
> least, it seemed to be necessary to do that to get any benefit from
> the patch on my comparatively modest workstation (using a similar
> pgbench SELECT benchmark to the one that you ran). Tom also had a
> concern about the portability of inlining without using "static
> inline" -- that is another reason to avoid the approach to inlining
> taken by v3 + v4.
>
> It's possible (though not very likely) that performance has been
> impacted by the deduplication patch (commit 0d861bbb), since it
> updated the definition of BTreeTupleGetNAtts() itself.
>
> Attached is v5, which inlines in a targeted fashion, pretty much in
> the same way as the earliest version. This is the same as v4 in every
> other way. Perhaps you can test this.

Hi Peter, just a quick code review:

The v5 patch files apply and pass the regression tests. I get no errors. The performance impact is within the noise. The TPS with the patch are higher sometimes and lower other times, looking across the 27 subtests of the "select-only" benchmark. Which subtest is slower or faster changes per run, so that doesn't seem to be relevant. I ran the "select-only" six times (three with the patch, three without). The change you made to the loop is clearly visible in the nbtsearch.s output, and the change to inline _bt_compare is even more visible, so there doesn't seem to be any defeating of your change due to the compiler ignoring the "inline" or such. I compiled using gcc -O2

% gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

2.4 GHz 8-Core Intel Core i9
32 GB 2667 MHz DDR4

Reading this thread, I think the lack of a performance impact on laptop hardware was expected, but perhaps confirmation that it does not make things worse is useful?

Since this patch doesn't seem to do any harm, I would mark it as "ready for committer", except that there doesn't yet seem to be enough evidence that it is a net win.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2020-05-28 20:02:46 Re: Operator class parameters and sgml docs
Previous Message Tomas Vondra 2020-05-28 18:57:50 Re: Trouble with hashagg spill I/O pattern and costing