From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Dimitri Fontaine <dim(at)tapoueh(dot)org>
Subject: Re: Built-in connection pooler
On Mon, Jul 29, 2019 at 07:14:27PM +0300, Konstantin Knizhnik wrote:
>On 26.07.2019 23:24, Tomas Vondra wrote:
>>I've started reviewing this patch and experimenting with it, so let me
>>share some initial thoughts.
>>1) not handling session state (yet)
>>I understand handling session state would mean additional complexity, so
>>I'm OK with not having it in v1. That being said, I think this is the
>>primary issue with connection pooling on PostgreSQL - configuring and
>>running a separate pool is not free, of course, but when people complain
>>to us it's when they can't actually use a connection pool because of
>>the session state they rely on.
>>So what are your plans regarding this feature? I think you mentioned
>>you already have the code in another product. Do you plan to submit it
>>in the pg13 cycle, or what's the plan? I'm willing to put some effort
>>into reviewing and testing that.
>I completely agree with you. My original motivation for implementing
>the built-in connection pooler
>was to be able to preserve session semantics (prepared statements,
>GUCs, temporary tables) for pooled connections.
>Almost all production systems have to use some kind of pooling. But in
>case of using pgbouncer & Co we lose the ability
>to use prepared statements, which can cause up to a 2x performance
>penalty (for simple OLTP queries).
>So I have implemented such a version of the connection pooler for PgPro EE.
>It requires many changes in the Postgres core, so I realized that there is
>no chance to get it committed in the community version
>(taking into account that many of my other patches, like autoprepare and libpq
>compression, have been postponed for a very long time, although
>they are much smaller and less invasive).
>Then Dimitri Fontaine proposed that I implement a much simpler version of the
>pooler based on the traditional proxy approach.
>This patch is the result of our conversation with Dimitri.
>You are asking me about my plans... I think that it will be better to
>first try to polish this version of the patch and commit it, and only
>after that add more sophisticated features
>like saving/restoring session state.
Well, I understand the history of this patch, and I have no problem with
getting a v1 of a connection pool without this feature. After all,
that's the idea of incremental development. But that only works when v1
allows adding that feature in v2, and I can't quite judge that. Which
is why I've asked you about your plans, because you clearly have more
insight thanks to writing the pooler for PgPro EE.
>>FWIW it'd be nice to expose it as some sort of interface, so that other
>>connection pools can leverage it too. There are use cases that don't
>>work with a built-in connection pool (say, PAUSE/RESUME in pgbouncer
>>allows restarting the database) so projects like pgbouncer or odyssey
>>are unlikely to disappear anytime soon.
>Obviously a built-in connection pooler will never completely replace
>external poolers like pgbouncer, which provide more flexibility, e.g.
>make it possible to install the pooler on a separate host or on the client side.
Sure. But that wasn't really my point - I was suggesting to expose this
hypothetical feature (managing session state) as some sort of API usable
from other connection pools.
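To make this idea concrete, here is a rough sketch of the shape such an API might take. All names here are invented for illustration; nothing like this exists in PostgreSQL today, and the real serialization would of course walk actual backend state rather than strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical session save/restore interface that an external pooler
 * could call between client reassignments (names are made up). */
typedef struct SessionState
{
	char	   *gucs;			/* serialized session-level GUC settings */
	char	   *prepared_stmts; /* serialized prepared statements */
} SessionState;

/* Capture the current session's state (stubbed out here). */
static SessionState *
SerializeSessionState(void)
{
	SessionState *st = malloc(sizeof(SessionState));

	st->gucs = strdup("search_path=public");
	st->prepared_stmts = strdup("x(int): SELECT ...");
	return st;
}

/* Re-apply a saved state onto a freshly assigned backend (stubbed);
 * returns 1 on success. */
static int
RestoreSessionState(const SessionState *st)
{
	return st->gucs != NULL && st->prepared_stmts != NULL;
}
```

The point of exposing this as an API rather than keeping it internal is that pgbouncer or odyssey could then restore session semantics across backend switches without reimplementing the logic.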
>>I also wonder if we could make it more permissive even in v1, without
>>implementing dump/restore of session state.
>>Consider for example patterns like this:
>> SET LOCAL enable_nestloop = off;
>> PREPARE x(int) AS SELECT ...;
>> EXECUTE x(1);
>> EXECUTE x(2);
>> EXECUTE x(100000);
>> DEALLOCATE x;
>>or perhaps even
>> CREATE FUNCTION f() AS $$ ... $$
>> LANGUAGE sql
>> SET enable_nestloop = off;
>>In all those cases (and I'm sure there are other similar examples) the
>>connection pool considers the session 'tainted' - it marks it as such
>>and we never reset that. So even when an application tries to play nice,
>>it can't use pooling.
>>Would it be possible to maybe track this with more detail (number of
>>prepared statements, ignore SET LOCAL, ...)? That should allow us to do
>>pooling even without full support for restoring session state.
>Sorry, I do not completely understand your idea (how to implement these
>features without maintaining session state).
My idea (sorry if it wasn't too clear) was that we might handle some
cases more gracefully.
For example, if we only switch between transactions, we don't quite care
about 'SET LOCAL' (but the current patch does set the tainted flag). The
same thing applies to GUCs set for a function.
For prepared statements, we might count the number of statements we
prepared and deallocated, and treat the session as 'not tainted' when no
statements remain. Maybe there's some risk I can't think of.
The same thing applies to temporary tables - if you create and drop a
temporary table, is there a reason to still treat the session as tainted?
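A counter-based approach could look roughly like this. This is only a sketch of the idea, not the patch's actual data structures; all names are invented:

```c
#include <stdbool.h>

/* Hypothetical per-session taint tracking (illustrative only). */
typedef struct SessionTaint
{
	int			nprepared;	  /* PREPARE count minus DEALLOCATE count */
	int			ntemptables;  /* CREATE TEMP TABLE minus DROP count */
	bool		session_gucs; /* plain SET seen (SET LOCAL is
							   * transaction-scoped, so it is ignored) */
} SessionTaint;

/* The backend can be handed to another client only when nothing with
 * session lifetime is left behind. */
static bool
session_is_tainted(const SessionTaint *t)
{
	return t->nprepared > 0 || t->ntemptables > 0 || t->session_gucs;
}
```

With counters like these, a session that prepares and later deallocates all its statements (or creates and drops its temp tables) would become reusable again, instead of being marked tainted forever.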
>To implement prepared statements we need to store them in the session
>context or at least add some session-specific prefix to prepared
>statement names.
>Temporary tables also require a per-session temporary tablespace. With
>GUCs the situation is even more complicated - actually most of the time on
>my PgPro-EE pooler version
>I have spent fighting with GUCs (default values, reloading
>configuration, memory allocation/deallocation, ...).
>But the "show stopper" is temporary tables: if they are accessed
>through local (non-shared) buffers, then you cannot reschedule a
>session to some other backend.
>This is why I now have a patch with an implementation of global temporary
>tables (a la Oracle) which have global metadata and are accessed through
>shared buffers (which also allows using them
>in parallel queries).
Yeah, temporary tables are messy. Global temporary tables would be nice,
not just because of this, but also because of catalog bloat.
>>I think we need to rethink how the pool is configured. The options
>>available at the moment are more a consequence of the implementation and
>>are rather cumbersome to use in some cases.
>>For example, we have session_pool_size, which is (essentially) the
>>number of backends kept in the pool. Which seems fine at first, because
>>it seems like you might say
>> max_connections = 100
>> session_pool_size = 50
>>to say the connection pool will only ever use 50 connections, leaving
>>the rest for "direct" connection. But that does not work at all, because
>>the number of backends the pool can open is
>> session_pool_size * connection_proxies * databases * roles
>>which pretty much means there's no limit, because while we can specify
>>the number of proxies, the number of databases and roles is arbitrary.
>>And there's no way to restrict which dbs/roles can use the pool.
>>So you can happily do this
>> max_connections = 100
>> connection_proxies = 4
>> session_pool_size = 10
>> pgbench -c 24 -U user1 test1
>> pgbench -c 24 -U user2 test2
>> pgbench -c 24 -U user3 test3
>> pgbench -c 24 -U user4 test4
>>at which point it's pretty much game over, because each proxy has 4
>>pools, each with ~6 backends, 96 backends in total. And because
>>non-tainted connections are never closed, no other users/dbs can use the
>>pool (will just wait indefinitely).
>>To allow practical configurations, I think we need to be able to define:
>>* which users/dbs can use the connection pool
>>* minimum/maximum pool size per user, per db and per user/db
>>* maximum number of backend connections
>>We need to be able to close connections when needed (when not assigned,
>>and we need the connection for someone else).
>>Plus those limits need to be global, not "per proxy" - it's just strange
>>that increasing connection_proxies bumps up the effective pool size.
>>I don't know what's the best way to specify this configuration - whether
>>to store it in a separate file, in some system catalog, or what.
>Well, I agree with you that maintaining a separate connection pool for
>each database/role pair may be confusing.
Anything can be confusing ...
>My assumption was that in many configurations applications are
>accessing the same (or a few) databases with one (or a very small
>number of) roles.
>If you have hundreds of databases or users (each connecting to the
>database under its own OS name), then a
>connection pooler will not work in any case, no matter how you
>configure it. That is true for pgbouncer and any other pooler as well.
Sure, but I don't expect connection pool to work in such cases.
But I do expect to be able to configure which users can use the
connection pool at all, and maybe assign them different pool sizes.
>If a Postgres backend is able to work only with one database, then you
>will have to start at least as many backends as the number of
>databases you have.
>The situation with users is more obscure - it may be possible to implement
>multiuser access to the same backend (as it can be done now using "set
>session authorization").
I don't think I've said we need anything like that. The way I'd expect
it to work is that when we run out of backend connections, we terminate
some existing ones (and then fork new backends).
>So I am not sure that if we implement a sophisticated configurator which
>allows specifying in some configuration file, for each database/role
>pair, the maximal/optimal number
>of workers, it will completely eliminate the problem with multiple
>databases and roles.
Why would we need to invent any sophisticated configurator? Why couldn't
we use some version of what pgbouncer already does, or maybe integrate
it somehow into pg_hba.conf?
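For illustration, a pg_hba.conf-style pool configuration might look something like this. This syntax is purely hypothetical; nothing of the sort exists, it just shows how per-db/per-role limits could reuse an existing, familiar file format:

```
# TYPE    DATABASE    USER     MIN_POOL    MAX_POOL    (hypothetical syntax)
pool      test1       user1    2           10
pool      test2       all      0           5
direct    all         admin    -           -           # bypass the pool
```

The appeal of this shape is that it answers all three questions at once: who may use the pool, what their per-db/per-role pool sizes are, and (via the sum of the max columns against max_connections) the global backend budget.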
>Particularly, assume that we have 3 databases and want to serve them
>with 10 workers.
>Now we receive 10 requests to database A. We start 10 backends which
>serve these queries.
>Then we receive 10 requests to database B. What should we do then?
>Terminate all these 10 backends and start 10 new ones
>instead? Or should we start 3 workers for database A, 3
>workers for database B and 4 workers for database C?
>In this case, if most of the requests are to database A, we will not be
>able to utilize all system resources.
>Certainly we can specify in the configuration file that database A needs 6
>workers and B/C two workers each.
>But that will only work if we statically know the workload...
My concern is not as much performance as inability to access the
database at all. There's no reasonable way to "guarantee" some number of
connections to a given database. Which is what pgbouncer does (through
per-database pool sizes).
Yes, it requires knowledge of the workload, but I don't think that's an
unreasonable requirement.
>So I have thought a lot about it, but failed to find a good and
>simple solution. It looks like if you want to do connection pooling
>efficiently, you should restrict the number of
>databases and roles.
I agree we should not over-complicate this, but I still find the current
configuration rather cumbersome.
>>I think we need much better monitoring capabilities. At this point we
>>have a single system catalog (well, a SRF) giving us proxy-level
>>summary. But I think we need much more detailed overview - probably
>>something like pgbouncer has - listing of client/backend sessions, with
>>details about their state.
>>Of course, that's difficult to do when those lists are stored in private
>>memory of each proxy process - I think we need to move this to shared
>>memory, which would also help to address some of the issues I mentioned
>>in the previous section (particularly that the limits need to be global,
>>not per proxy).
>I also agree that more monitoring facilities are needed.
>Just want to get a better understanding of what kind of information we need
>to collect.
>As the pooler works at the transaction level, all non-active sessions
>are in idle state,
>and the state of active sessions can be inspected using pg_stat_activity.
Except when sessions are tainted, for example. And when the transactions
are long-running, it's still useful to list the connections.
I'd suggest looking at the stats available in pgbouncer, most of that
actually comes from practice (monitoring metrics, etc.)
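For reference, pgbouncer exposes its monitoring through a virtual admin database, queried with ordinary SQL-looking commands. A sample of what's available (exact columns vary by pgbouncer version):

```sql
-- connect with e.g.: psql -p 6432 pgbouncer
SHOW POOLS;    -- per db/user pair: cl_active, cl_waiting, sv_active, sv_idle, ...
SHOW CLIENTS;  -- one row per client connection, with state and addresses
SHOW SERVERS;  -- one row per server (backend) connection
SHOW STATS;    -- per-database totals: query counts, bytes in/out, timings
```

Something equivalent for the built-in pooler would likely mean a set of SRFs over shared-memory state, so any backend can report on all proxies.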
>>I find it quite strange that restart_pooler_on_reload binds restart of
>>the connection pool to reload of the configuration file. That seems like
>>a rather surprising behavior, and I don't see why would you ever want
>>that? Currently it seems like the only way to force the proxies to close
>>the connections (the docs mention DROP DATABASE), but why shouldn't we
>>have separate functions to do that? In particular, why would you want to
>>close connections for all databases and not just for the one you're
>>trying to drop?
>Reload configuration is already broadcast to all backends.
>In case of using some other approach for controlling pool workers,
>it would be necessary to implement our own notification mechanism.
>Certainly it is doable. But as I already wrote, the primary idea was
>to keep this patch simple and make it as minimally invasive as possible.
>>It's nice we support different strategies to assign connections to
>>worker processes, but how do you tune it? How do you pick the right
>>option for your workload? We either need to provide metrics to allow
>>informed decision, or just not provide the option.
>The honest answer to this question is "I don't know".
>I have just implemented a few different policies, and assume that people
>will test them on their workloads and
>tell me which one is most efficient. Then it will be possible to
>give some recommendations on how to choose among them.
>>And "load average" may be a bit misleading term (as used in the section
>>about load-balancing option). It kinda suggests we're measuring how busy
>>the different proxies were recently (that's what load average in Unix
>>does) - by counting active processes, CPU usage or whatever. But AFAICS
>>that's not what's happening at all - it just counts the connections,
>>with SSL connections counted as more expensive.
>Generally I agree. The current criteria for "load-balancing" may be too dubious.
>Maybe the formula should include some other metrics rather than just
>the number of connected clients.
>But I failed to find such metrics. CPU usage? But the proxies themselves
>use CPU only for redirecting traffic.
>Assume that one proxy is serving 10 clients performing OLAP queries
>and another one 100 clients performing OLTP queries.
>Certainly OLTP queries tend to be executed much faster. But it is
>hard to estimate the amount of transferred data for both proxies.
>Generally OLTP queries access only a few records, while OLAP queries
>access much more data. But OLAP queries usually perform some
>aggregation, so the final result may also be small...
>It looks like we need to measure not only the load of the proxy itself but also
>the load of the backends connected to this proxy.
>But that requires much more effort.
I think "smart" load-balancing is fairly difficult to get right. I'd
just cut it from the initial patch, keeping just the simple strategies.
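As I read the description, the current "load" metric amounts to a weighted connection count, something like the sketch below (the weight value is illustrative, not the patch's actual constant):

```c
/* Sketch of a weighted-connection-count "load" metric: SSL clients are
 * counted as more expensive than plain ones, as the patch's docs
 * describe. Not the patch's actual code. */
static int
proxy_load(int n_plain_clients, int n_ssl_clients, int ssl_weight)
{
	return n_plain_clients + n_ssl_clients * ssl_weight;
}
```

Which is exactly why "load average" is a misleading name for it: it measures how many clients are attached, not how busy the proxy has recently been.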
>>6) issues during testing
>>While testing, I've seen a couple of issues. Firstly, after specifying a
>>db that does not exist:
>> psql -h localhost -p 6543 xyz
>>just hangs and waits forever. In the server log I see this:
>> 2019-07-25 23:16:50.229 CEST  FATAL: database "xyz" does not exist
>> 2019-07-25 23:16:50.258 CEST  WARNING: could not setup local connect to server
>> 2019-07-25 23:16:50.258 CEST  DETAIL: FATAL: database "xyz" does not exist
>>But the client somehow does not get the message and waits.
>>Secondly, when trying this
>> pgbench -p 5432 -U x -i -s 1 test
>> pgbench -p 6543 -U x -c 24 -C -T 10 test
>>it very quickly locks up, with plenty of non-granted locks in pg_locks,
>>but I don't see any interventions by deadlock detector so I presume
the issue is somewhere else. I don't see any such issues when running
>>without the connection pool or without the -C option:
>> pgbench -p 5432 -U x -c 24 -C -T 10 test
>> pgbench -p 6543 -U x -c 24 -T 10 test
>>This is with default postgresql.conf, except for
>> connection_proxies = 4
>I need more time to investigate this problem.
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services