Re: Postgres with pthread

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>, james(at)mansionfamily(dot)plus(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres with pthread
Date: 2017-12-27 11:17:11
Message-ID: 088b4bc3-9033-ea7d-eb46-20d8a3890424@postgrespro.ru
Lists: pgsql-hackers

On 27.12.2017 13:08, Andres Freund wrote:
>
> On December 27, 2017 11:05:52 AM GMT+01:00, james <james(at)mansionfamily(dot)plus(dot)com> wrote:
>>> All threads are blocked in semaphores.
>> That they are blocked is inevitable - I guess the issue is that they
>> are thrashing.
>> I guess it would be necessary to separate the internals to have some
>> internal queueing and effectively reduce the number of actively
>> executing threads.
>> In effect make the connection pooling work internally.
>>
>> Would it be possible to make the caches have persistent (functional)
>> data structures - effectively CoW?
>>
>> And how easy would it be to abort if the master view had subsequently
>> changed when it comes to execution?
> Optimizing for this seems like a pointless exercise. If the goal is efficient processing of 100k connections the solution is a session / connection abstraction and a scheduler. Optimizing for this amount of concurrency just will add complexity and slowdowns for a workload that nobody will run.
I agree with you that supporting 100k active connections does not make
much practical sense now.
But there are many systems with hundreds of cores, and to utilize them
we still need to spawn thousands of backends.
In this case Postgres snapshots and local caches become inefficient.
Switching to CSN goes some way toward solving the problem with snapshots.
But the problems with private caches should also be addressed: it seems
very wasteful to perform the same work 1000 times and maintain 1000
copies of the result.
Also, in the case of global prepared statements, the presence of a
global cache makes it possible to spend more time on plan optimization
or to use manual tuning.

Switching to the pthreads model significantly simplifies development of
shared caches: there are no problems with a statically allocated shared
address space or with dynamic segments mapped at different addresses,
which prevent the use of normal pointers. Invalidation of a shared cache
is also easier: there is no need to send invalidation notifications to
all backends.
But it still requires a lot of work. For example, the catalog cache is
tightly integrated with resource owner information.
Also, a shared cache requires synchronization, and this synchronization
itself can become a bottleneck.

> Andres

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
