RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Floris Van Nee <florisvannee(at)Optiver(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
Date: 2020-01-28 21:34:34
Message-ID: 2ca215d085a74e2eabd31b76e97cc9f3@opammb0561.comp.optiver.com
Lists: pgsql-hackers


>
> I could do some tests with the patch on some larger machines. What exact
> tests do you propose? Are there some specific postgresql.conf settings and
> pgbench initialization you recommend for this? And was the test above just
> running 'pgbench -S' select-only with specific -T, -j and -c parameters?
>

With Andres' instructions I ran a couple of tests. With your patches I can reliably reproduce a speedup of ~3% in single-core tests on a dual-socket 36-core machine for the pgbench select-only test case. When using the full scale test, my results are unfortunately way too noisy, even for large runs. I also tried some other queries (for example, SELECTs that return 10 or 100 rows instead of just 1), but I can't see much of a speedup there either, although it also doesn't hurt.

So I guess the most noticeable one is the select-only benchmark for 1 core:

<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 30255419
latency average = 0.020 ms
latency stddev = 0.001 ms
tps = 50425.693234 (including connections establishing)
tps = 50425.841532 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 1
number of threads: 1
duration: 600 s
number of transactions actually processed: 31363398
latency average = 0.019 ms
latency stddev = 0.001 ms
tps = 52272.326597 (including connections establishing)
tps = 52272.476380 (excluding connections establishing)

This is the one with 40 clients, 40 threads. Not really an improvement, and still quite noisy.
<Master>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 876846915
latency average = 0.027 ms
latency stddev = 0.015 ms
tps = 1461407.539610 (including connections establishing)
tps = 1461422.084486 (excluding connections establishing)

<Patched>
transaction type: <builtin: select only>
scaling factor: 300
query mode: prepared
number of clients: 40
number of threads: 40
duration: 600 s
number of transactions actually processed: 872633979
latency average = 0.027 ms
latency stddev = 0.038 ms
tps = 1454387.326179 (including connections establishing)
tps = 1454396.879195 (excluding connections establishing)
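
To put the two pairs of runs above side by side, the relative change in tps can be worked out directly from the reported figures. This is only a rough sketch: the helper name relative_change is made up for illustration, and it uses the "excluding connections establishing" tps values copied from the output above.

```python
def relative_change(baseline_tps: float, patched_tps: float) -> float:
    """Return the change of patched over baseline, in percent."""
    return (patched_tps - baseline_tps) / baseline_tps * 100.0


# tps (excluding connections establishing), copied from the runs above:
# (master, patched)
runs = {
    "1 client, 1 thread": (50425.841532, 52272.476380),
    "40 clients, 40 threads": (1461422.084486, 1454396.879195),
}

for label, (master, patched) in runs.items():
    print(f"{label}: {relative_change(master, patched):+.2f}%")
```

For the single-client run this comes out at roughly +3.7%, in line with the ~3% mentioned earlier; for the 40-client run it is about -0.5%, which is small next to the run-to-run noise.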

For tests that don't use the full machine (e.g. 10 clients, 10 threads) I see speed-ups as well, but not as high as in the single-core run. It seems there are other bottlenecks (on the machine) coming into play.

-Floris
