Re: Init connection time grows quadratically

From: "Maksim(dot)Melnikov" <m(dot)melnikov(at)postgrespro(dot)ru>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Потапов Александр <a(dot)potapov(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Init connection time grows quadratically
Date: 2026-06-11 14:31:47
Message-ID: 08e7c338-e175-4474-be53-6538bb67d4b6@postgrespro.ru
Lists: pgsql-hackers


On 6/3/26 16:35, Matthias van de Meent wrote:
> On Wed, 3 Jun 2026 at 08:33, Maksim.Melnikov<m(dot)melnikov(at)postgrespro(dot)ru> wrote:
>> On 6/16/25 11:56, Потапов Александр wrote:
>>
>>> To be more precise I used constant number of threads (128 and 1024) to compare with previous results. The quadratic dependency exists everywhere, see new graph.
>>>
>>>> Q: Did you check that pgbench or the OS does not have
>>>> O(n_active_connections) or O(n_active_threads) overhead per worker
>>>> during thread creation or connection establishment, e.g. by varying
>>>> the number of threads used to manage these N clients? I wouldn't be
>>>> surprised if there are inefficiencies in e.g. the threading- or
>>>> synchronization model that cause O(N) per-thread overhead, or O(N^2)
>>>> overall when you have one thread per connection.
>> Hi, all!
>>
>> I've investigated a slightly different scenario than Alexander, and I want to share my thoughts in this thread too.
>>
>> I found that when we run pgbench scenarios sequentially, without a postgres restart between iterations, the initial connection time degrades from launch to launch and eventually stabilizes at values worse than in the first run (ICT_degradation.png attached).
>>
>> Scenario details:
> [...]
>> 4.Add to the postgresql.conf:
>> huge_pages = off #for the sake of test stability and reproducibility
> I think this is the main culprit of the extreme slowdown -- without
> huge pages, you're effectively guaranteed to get many minor page
> faults, and with it the relevant TLB miss rates. With huge pages
> enabled, the proc array should fit on one (or just a few) memory
> pages.
>
> We're not generally in the business for optimizing workloads that have
> huge_pages=off.

Yes, I agree, huge_pages=off is not a common setup now. My motivation was
that even if a configuration isn't commonly used, that doesn't mean
nobody is interested in it, and it can be optimized as long as the basic
scenarios don't degrade. Moreover, huge_pages = try is the default
value: the server tries to request huge pages, but falls back to the
huge_pages=off behavior if that fails. As far as I know, on Linux the
default value of vm.nr_hugepages is 0, which means that by default the
OS does not reserve any HugeTLB pages. Of course, the DBA should set
this up, but in practice it can be missed. Anyway, if the community
isn't interested in this kind of optimization, that's OK. It was an
interesting and educational investigation for me, thanks for your help.
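
For anyone who wants to check which case a given Linux host falls into,
here is a minimal sketch. It assumes the usual /proc/meminfo format; the
function names and the "free minus reserved" heuristic are my own
illustration, not something PostgreSQL does internally, and it ignores
overcommitted (surplus) huge pages.

```python
def parse_hugepage_info(meminfo_text):
    """Extract the HugePages_* counters and Hugepagesize (in kB) from
    the text of /proc/meminfo."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key.startswith("HugePages_") or key == "Hugepagesize":
            info[key] = int(rest.split()[0])
    return info


def try_would_fall_back(meminfo_text, shared_memory_bytes):
    """Rough heuristic (an assumption, not PostgreSQL's logic): report
    True when the huge page pool looks too small for a shared memory
    segment of the given size, so huge_pages = try would likely end up
    on regular pages."""
    info = parse_hugepage_info(meminfo_text)
    page_bytes = info.get("Hugepagesize", 0) * 1024
    if page_bytes == 0:
        return True
    available = info.get("HugePages_Free", 0) - info.get("HugePages_Rsvd", 0)
    needed = -(-shared_memory_bytes // page_bytes)  # ceiling division
    return available < needed


# Example with the pool left at its default (vm.nr_hugepages = 0):
SAMPLE = """\
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
"""
print(try_would_fall_back(SAMPLE, 1 << 30))

# On a real host, pass the live file contents instead:
#   with open("/proc/meminfo") as f:
#       print(try_would_fall_back(f.read(), 1 << 30))
```

The 1 GiB segment size is only a placeholder; the real requirement
depends on shared_buffers and the other shared memory consumers.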

> .....
>
>
>> As we can see, the patched version fixes this. I made a series of measurements for all versions and attached a comparison chart (ICT_degradation_with_patch.png). I also added a table with the results.
> Do you happen to have data with huge_pages enabled?
>
>> I hope it will be interesting and helpful.
> Definitely interesting. I'm not so sure it's as effective on a
> production configuration (with huge pages enabled), but I'm definitely
> interested in seeing test results.

I've made comparative measurements for configurations with huge_pages =
on/off. The results are below.

Clients | Huge-pages-off-  | Huge-pages-off-   | Huge-pages-on-   | Huge-pages-on-
number  | with-patch       | without-patch     | with-patch       | without-patch
--------+------------------+-------------------+------------------+-----------------
512     | ~480 +- 3.5% ms  | ~490 +- 3% ms     | ~420 +- 3.5% ms  | ~420 +- 3.5% ms
1024    | ~910 +- 1.3% ms  | ~990 +- 2% ms     | ~790 +- 1.7% ms  | ~800 +- 1.8% ms
2048    | ~1810 +- 1.4% ms | ~2230 +- 0.9% ms  | ~1540 +- 0.7% ms | ~1530 +- 1.4% ms
4096    | ~3690 +- 1.9% ms | ~6060 +- 0.8% ms  | ~3070 +- 0.6% ms | ~3070 +- 0.9% ms
8192    | ~9900 +- 0.6% ms | ~18530 +- 0.4% ms | ~6220 +- 0.7% ms | ~6230 +- 0.7% ms

A comparison chart is also attached.

As we can see, the measurements confirm that the patch helps for the
configuration with huge_pages=off (the same result as in the previous
message), but for huge_pages=on I got the same results for both
versions: no improvement and no degradation.
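
To put the table in relative terms, here is a small sketch that uses
only the approximate means reported above and ignores the error bars.
The dictionary keys and function names are made up for illustration. It
computes the relative reduction from the patch and the growth factor per
doubling of the client count (about 2 would mean linear growth, about 4
quadratic).

```python
# Approximate mean times in ms, copied from the table (error bars ignored).
RESULTS_MS = {
    512:  {"off_patched": 480,  "off_unpatched": 490,   "on_patched": 420,  "on_unpatched": 420},
    1024: {"off_patched": 910,  "off_unpatched": 990,   "on_patched": 790,  "on_unpatched": 800},
    2048: {"off_patched": 1810, "off_unpatched": 2230,  "on_patched": 1540, "on_unpatched": 1530},
    4096: {"off_patched": 3690, "off_unpatched": 6060,  "on_patched": 3070, "on_unpatched": 3070},
    8192: {"off_patched": 9900, "off_unpatched": 18530, "on_patched": 6220, "on_unpatched": 6230},
}


def reduction_percent(patched_ms, unpatched_ms):
    """Relative time saved by the patch, in percent of the unpatched time."""
    return (1.0 - patched_ms / unpatched_ms) * 100.0


def doubling_ratios(times_by_clients):
    """Ratio of each time to the time at the previous (half) client count."""
    clients = sorted(times_by_clients)
    return [times_by_clients[b] / times_by_clients[a]
            for a, b in zip(clients, clients[1:])]


for n in sorted(RESULTS_MS):
    row = RESULTS_MS[n]
    off = reduction_percent(row["off_patched"], row["off_unpatched"])
    on = reduction_percent(row["on_patched"], row["on_unpatched"])
    print(f"{n:5d} clients: huge_pages=off {off:5.1f}%  huge_pages=on {on:5.1f}%")

for column in ("off_unpatched", "off_patched", "on_unpatched"):
    series = {n: row[column] for n, row in RESULTS_MS.items()}
    ratios = ", ".join(f"{r:.2f}" for r in doubling_ratios(series))
    print(f"{column}: growth per doubling = {ratios}")
```

Since the inputs are rounded means, differences of about a percent (as
in the huge_pages=on columns) are within the reported noise.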

> ----
>
> Some comments on the patch:
>
A patch with the fixes is attached. Thanks for the review.

Best regards,

Maksim Melnikov

Attachment Content-Type Size
image/png 49.7 KB
v2-0001-This-patch-reduce-connection-init-close-time.patch text/x-patch 14.2 KB
