Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)
Date: 2022-05-27 13:35:08
Message-ID: CAEudQAocUznNq59O5ykQBTN_KB4odiJK0r5ow_4dy78FOBYg_A@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 26, 2022 at 10:30 PM Tomas Vondra <
tomas(dot)vondra(at)enterprisedb(dot)com> wrote:

> On 5/27/22 02:11, Ranier Vilela wrote:
> >
> > ...
> >
> > Here the results with -T 60:
>
> Might be a good idea to share your analysis / interpretation of the
> results, not just the raw data. After all, the change is being proposed
> by you, so do you think this shows the change is beneficial?
>
I think so, but my expectations have been lowered.
I expected that performance would improve as the number of connections
grew, but that does not happen in these tests for either patch or head:
performance degrades as the number of connections increases.
Is GetSnapshotData() not the bottleneck?

>
> > Linux Ubuntu 64 bits
> > shared_buffers = 128MB
> >
> > ./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres
> >
> > pgbench (15beta1)
> > transaction type: <builtin: select only>
> > scaling factor: 1
> > query mode: prepared
> > number of clients: 100
> > number of threads: 100
> > maximum number of tries: 1
> > duration: 60 s
> >
> > conns tps head tps patched
> >
> > 1 17126.326108 17792.414234
> > 10 82068.123383 82468.334836
> > 50 73808.731404 74678.839428
> > 80 73290.191713 73116.553986
> > 90 67558.483043 68384.906949
> > 100 65960.982801 66997.793777
> > 200 62216.011998 62870.243385
> > 300 62924.225658 62796.157548
> > 400 62278.099704 63129.555135
> > 500 63257.930870 62188.825044
> > 600 61479.890611 61517.913967
> > 700 61139.354053 61327.898847
> > 800 60833.663791 61517.913967
> > 900 61305.129642 61248.336593
> > 1000 60990.918719 61041.670996
> >
>
> These results look much saner, but IMHO it also does not show any clear
> benefit of the patch. Or are you still claiming there is a benefit?
>
We agree that these are micro-optimizations.
However, I think they deserve consideration as micro-optimizations of
inner loops, because everything in procarray.c is a hot path.
The first objective, I believe, was achieved: no performance regression.
I agree the gains shown by these tests are small. But IMHO this is a good
approach; small gains add up to big gains when applied across the whole
code base.

Consider GetSnapshotData():
1. Most of the time the snapshot is not NULL, so the test
if (snapshot == NULL) fails most of the time.

With the patch:

if (snapshot->xip != NULL)
{
    if (GetSnapshotDataReuse(snapshot))
        return snapshot;
}

Most of the time that test is true, and GetSnapshotDataReuse() is not
called for new snapshots.

So count, subcount and suboverflowed are not initialized for every
snapshot.

2. If the snapshot is taken during recovery,
pgprocnos and ProcGlobal->subxidStates are not touched unnecessarily,
only if the snapshot has not suboverflowed.
All InvalidTransactionId entries, mypgxactoff, backends doing logical
decoding, and XIDs >= xmax are skipped.

3. GetSnapshotDataReuse() is called without first acquiring ProcArrayLock.
There's an agreement that this is fine, for now.

Consider ComputeXidHorizons():
1. ProcGlobal->statusFlags is touched before the lock is taken.
2. allStatusFlags[index] is not touched for every one of numProcs entries.

All changes were made with the aim of avoiding or postponing unnecessary
work.

> BTW it's generally a good idea to do multiple runs and then use the
> average and/or median. Results from a single run may be quite noisy.
>
> >
> > Linux Ubuntu 64 bits
> > shared_buffers = 2048MB
> >
> > ./pgbench -M prepared -c $conns -j $conns -S -n -U postgres
> >
> > pgbench (15beta1)
> > transaction type: <builtin: select only>
> > scaling factor: 1
> > query mode: prepared
> > number of clients: 100
> > number of threads: 100
> > maximum number of tries: 1
> > number of transactions per client: 10
> >
> > conns tps head tps patched
> >
> > 1 2918.004085 3211.303789
> > 10 12262.415696 15540.015540
> > 50 13656.724571 16701.182444
> > 80 14338.202348 16628.559551
> > 90 16597.510373 16835.016835
> > 100 17706.775793 16607.433487
> > 200 16877.067441 16426.969799
> > 300 16942.260775 16319.780662
> > 400 16794.514911 16155.023607
> > 500 16598.502151 16051.106724
> > 600 16717.935001 16007.171213
> > 700 16651.204834 16004.353184
> > 800 16467.546583 16834.591719
> > 900 16588.241149 16693.902459
> > 1000 16564.985265 16936.952195
> >
>
> I think we've agreed these results are useless.
>
> >
> > Linux Ubuntu 64 bits
> > shared_buffers = 2048MB
> >
> > ./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres
> >
> > pgbench (15beta1)
> > transaction type: <builtin: select only>
> > scaling factor: 1
> > query mode: prepared
> > number of clients: 100
> > number of threads: 100
> > maximum number of tries: 1
> > duration: 60 s
> >
> > conns tps head tps patched
> >
> > 1 17174.265804 17792.414234
> > 10 82365.634750 82468.334836
> > 50 74593.714180 74678.839428
> > 80 69219.756038 73116.553986
> > 90 67419.574189 68384.906949
> > 100 66613.771701 66997.793777
> > 200 61739.784830 62870.243385
> > 300 62109.691298 62796.157548
> > 400 61630.822446 63129.555135
> > 500 61711.019964 62755.190389
> > 600 60620.010181 61517.913967
> > 700 60303.317736 61688.044232
> > 800 60451.113573 61076.666572
> > 900 60017.327157 61256.290037
> > 1000 60088.823434 60986.799312
> >
>
> I have no idea why shared buffers 2GB would be interesting. The proposed
> change was related to procarray, not shared buffers. And scale 1 is
> ~15MB of data, so it fits into 128MB just fine.
>
I thought about doing this benchmark with the most common usage
configuration (shared_buffers = 25% of RAM).

> Also, the first ~10 results for "patched" case match results for 128MB
> shared buffers. That seems very unlikely to happen by chance, so this
> seems rather suspicious.
>
Probably a copy-and-paste mistake.
I redid this test for the patched build:

Linux Ubuntu 64 bits
shared_buffers = 2048MB

./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres

pgbench (15beta1)
transaction type: <builtin: select only>
scaling factor: 1
query mode: prepared
number of clients: 100
number of threads: 100
maximum number of tries: 1
duration: 60 s

conns tps head tps patched

1 17174.265804 17524.482668
10 82365.634750 81840.537713
50 74593.714180 74806.729434
80 69219.756038 73116.553986
90 67419.574189 69130.749209
100 66613.771701 67478.234595
200 61739.784830 63094.202413
300 62109.691298 62984.501251
400 61630.822446 63243.232816
500 61711.019964 62827.977636
600 60620.010181 62838.051693
700 60303.317736 61594.629618
800 60451.113573 61208.629058
900 60017.327157 61171.001256
1000 60088.823434 60558.067810

regards,
Ranier Vilela
