Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)
Date: 2022-05-29 20:02:32
Message-ID: 164d0d4b-1ebd-8d4b-f66f-1dad09623929@enterprisedb.com
Lists: pgsql-hackers

On 5/29/22 19:26, Ranier Vilela wrote:
> ...
> I redid the benchmark with a better machine:
> Intel i7-10510U
> RAM 8GB
> SSD 512GB
> Linux Ubuntu 64 bits
>
> All files are attached, including the raw data of the results.
> I did the calculations as requested.
> But a quick average of the 10 benchmarks resulted in 10,000 tps more.
> Not bad, for a simple patch, made entirely of micro-optimizations.
>

I am a bit puzzled by the calculations.

It seems you somehow sum the differences for each run, and then average
that over all the runs. So, something like

    SELECT avg(delta_tps) FROM (
        SELECT run, SUM(patched_tps - master_tps) AS delta_tps
          FROM results GROUP BY run
    ) foo;

That's certainly an "unorthodox" way to evaluate the results, because it
mixes results for different client counts. It's not what I suggested,
and it's a pretty useless view on the data, as it obfuscates how
throughput depends on the client count.

And no, the resulting 10k does not mean you've "gained" 10k tps
anywhere; none of the "diff" values is anywhere close to that value. If
you tested more client counts, you'd probably get a bigger difference,
simply because there would be more values to sum. Compared to the
"sum(tps)" for each run, it's about a 0.8% difference. But even that is
entirely useless, due to mixing different client counts.
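To make the point concrete, here is a rough Python sketch of that kind
of summation. The per-run raw data is not reproduced in this message, so
as an assumption the sketch sums the per-client medians from the table
further down instead; the names (medians, deltas, ...) are made up for
illustration only.

```python
# Per-client medians (master, patched), copied from the table in this message.
# Assumption: medians stand in for the per-run values from the attachments.
medians = {
    1: (39340, 40173),
    10: (132462, 134274),
    50: (115669, 116575),
    100: (97931, 98816),
    200: (88912, 89660),
    300: (87879, 88636),
    400: (87721, 88219),
    500: (87267, 88078),
    600: (87317, 87781),
    700: (86907, 87603),
    800: (86852, 87364),
    900: (86578, 87173),
    1000: (86481, 86969),
}

# Difference in tps for each client count.
deltas = {clients: patched - master
          for clients, (master, patched) in medians.items()}

# Summing across client counts is what produces a number around 10k.
summed_delta = sum(deltas.values())
largest_delta = max(deltas.values())

# The same sum relative to the summed master throughput.
summed_master = sum(master for master, _ in medians.values())
relative_percent = summed_delta / summed_master * 100.0

print(f"sum of deltas:    {summed_delta} tps")
print(f"largest delta:    {largest_delta} tps")
print(f"relative to sum:  {relative_percent:.2f}%")
```

The sum comes out around 10k only because 13 client counts are added
together; the largest single difference is under 2k tps.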

I'm sorry, but this is so silly it's hard to even explain why ...

What I meant is calculating the median for each client count. For
example, for the master branch you get 10 values for 1 client

38820 39245 39773 39597 39301 39442 39379 39622 38909 38454

and if you calculate the median, you'll get 39340 (and stdev 411). The
same goes for the other client counts. If you do that, you'll get this:

clients    master   patched    diff
-----------------------------------
      1     39340     40173   2.12%
     10    132462    134274   1.37%
     50    115669    116575   0.78%
    100     97931     98816   0.90%
    200     88912     89660   0.84%
    300     87879     88636   0.86%
    400     87721     88219   0.57%
    500     87267     88078   0.93%
    600     87317     87781   0.53%
    700     86907     87603   0.80%
    800     86852     87364   0.59%
    900     86578     87173   0.69%
   1000     86481     86969   0.56%
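
For reference, the per-client calculation can be sketched in a few lines
of Python. This is only an illustration using the 10 master values for
1 client quoted above; the variable names and the diff_percent helper
are hypothetical, and the stdev is the sample standard deviation.

```python
from statistics import median, stdev

# The 10 master-branch results (tps) for 1 client, as quoted in this message.
master_1_client = [38820, 39245, 39773, 39597, 39301,
                   39442, 39379, 39622, 38909, 38454]

# Median and sample standard deviation for this one client count.
master_median = median(master_1_client)
master_stdev = stdev(master_1_client)


def diff_percent(master_tps, patched_tps):
    """Relative change of patched vs. master, in percent (hypothetical helper)."""
    return (patched_tps - master_tps) / master_tps * 100.0


# 40173 is the patched median for 1 client, taken from the table.
print(f"median = {master_median:.0f}, stdev = {master_stdev:.0f}")
print(f"diff   = {diff_percent(master_median, 40173):.2f}%")
```

Repeating this for each client count separately keeps the results for
different client counts apart, which is the whole point.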

How exactly this improves scalability is completely unclear to me.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
