Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)
Date: 2022-05-28 00:36:30
Message-ID: CAEudQAqnAgrKmcx+zhOgHJ8YULhHMQq2kmZfk3u+axaCdXcVpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Em sex., 27 de mai. de 2022 às 18:22, Andres Freund <andres(at)anarazel(dot)de>
escreveu:

> Hi,
>
> On 2022-05-27 10:35:08 -0300, Ranier Vilela wrote:
> > Em qui., 26 de mai. de 2022 às 22:30, Tomas Vondra <
> > tomas(dot)vondra(at)enterprisedb(dot)com> escreveu:
> >
> > > On 5/27/22 02:11, Ranier Vilela wrote:
> > > >
> > > > ...
> > > >
> > > > Here the results with -T 60:
> > >
> > > Might be a good idea to share your analysis / interpretation of the
> > > results, not just the raw data. After all, the change is being proposed
> > > by you, so do you think this shows the change is beneficial?
> > >
> > I think so, but the expectation has diminished.
> > I expected that the more connections, the better the performance.
> > And for both patch and head, this doesn't happen in tests.
> > Performance degrades with a greater number of connections.
>
> Your system has four CPUs. Once they're all busy, adding more connections
> won't improve performance. It'll just add more and more context switching,
> cache misses, and make the OS scheduler do more work.
>
conns tps head
10 82365.634750
50 74593.714180
80 69219.756038
90 67419.574189
100 66613.771701
Yes it is quite disappointing that with 100 connections, tps loses to 10
connections.

>
>
>
> > GetSnapShowData() isn't a bottleneck?
>
> I'd be surprised if it showed up in a profile on your machine with that
> workload in any sort of meaningful way. The snapshot reuse logic will
> always
> work - because there are no writes - and thus the only work that needs to
> be
> done is to acquire the ProcArrayLock briefly. And because there is only a
> small number of cores, contention on the cacheline for that isn't a
> problem.
>
Thanks for sharing this.

>
>
> > > These results look much saner, but IMHO it also does not show any clear
> > > benefit of the patch. Or are you still claiming there is a benefit?
> > >
> > We agree that they are micro-optimizations. However, I think they
> should be
> > considered micro-optimizations in inner loops, because all in
> procarray.c is
> > a hotpath.
>
> As explained earlier, I don't agree that they optimize anything - you're
> making some of the scalability behaviour *worse*, if it's changed at all.
>
>
> > The first objective, I believe, was achieved, with no performance
> > regression.
> > I agree, the gains are small, by the tests done.
>
> There are no gains.
>
IMHO, I must disagree.

>
>
> > But, IMHO, this is a good way, small gains turn into big gains in the
> end,
> > when applied to all code.
> >
> > Consider GetSnapShotData()
> > 1. Most of the time the snapshot is not null, so:
> > if (snaphost == NULL), will fail most of the time.
> >
> > With the patch:
> > if (snapshot->xip != NULL)
> > {
> > if (GetSnapshotDataReuse(snapshot))
> > return snapshot;
> > }
> >
> > Most of the time the test is true and GetSnapshotDataReuse is not called
> > for new
> > snapshots.
> > count, subcount and suboverflowed, will not be initialized, for all
> > snapshots.
>
> But that's irrelevant. There's only a few "new" snapshots in the life of a
> connection. You're optimizing something irrelevant.
>
IMHO, when GetSnapShotData() is the bottleneck, all is relevant.

>
>
> > 2. If snapshot is taken during recoverys
> > The pgprocnos and ProcGlobal->subxidStates are not touched unnecessarily.
>
> That code isn't reached when in recovery?
>
Currently it is reached *even* when not in recovery.
With the patch, *only* is reached when in recovery.

>
> > 3. Calling GetSnapshotDataReuse() without first acquiring ProcArrayLock.
> > There's an agreement that this would be fine, for now.
>
> There's no such agreement at all. It's not correct.
>
Ok, but there is a chance it will work correctly.

>
> > Consider ComputeXidHorizons()
> > 1. ProcGlobal->statusFlags is touched before the lock.
>
> Hard to believe that'd have a measurable effect.
>
IMHO, anything you take out of the lock is a benefit.

>
>
> > 2. allStatusFlags[index] is not touched for all numProcs.
>
> I'd be surprised if the compiler couldn't defer that load on its own.
>
Better be sure of that, no?

regards,
Ranier Vilela

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-05-28 01:13:20 Re: SLRUs in the main buffer pool, redux
Previous Message Ranier Vilela 2022-05-28 00:15:50 Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)