Re: Built-in connection pooler

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, "ideriha(dot)takeshi(at)fujitsu(dot)com" <ideriha(dot)takeshi(at)fujitsu(dot)com>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Dimitri Fontaine <dim(at)tapoueh(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Built-in connection pooler
Date: 2021-04-28 10:14:56
Message-ID: 1915.1619604896@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Konstantin Knizhnik <knizhnik(at)garret(dot)ru> wrote:

> People asked me to resubmit built-in connection pooler patch to commitfest.
> Rebased version of connection pooler is attached.

I've reviewd the patch but haven't read the entire thread thoroughly. I hope
that I don't duplicate many comments posted earlier by others.

(Please note that the patch does not apply to the current master, I had to
reset the head of my repository to the appropriate commit.)

Documentation / user interface

* session_pool_size (config.sgml)

I wonder if

"The default value is 10, so up to 10 backends will serve each database,"

should rather be

"The default value is 10, so up to 10 backends will serve each database/user combination."

However, when I read the code, I think that each proxy counts the size of the
pool separately, so the total number of backends used for particular
database/user combination seems to be

session_pool_size * connection_proxies

Since the feature uses shared memory statistics anyway, shouldn't it only
count the total number of backends per database/user? It would need some
locking, but the actual pools (hash tables) could still be local to the proxy

* connection_proxies

(I've noticed that Ryan Lambert questioned this variable upthread.)

I think this variable makes the configuration less straightforward from the
user perspective. Cannot the server launch additional proxies dynamically, as
needed, e.g. based on the shared memory statistics that the patch introduces?
I see that postmaster would have to send the sockets in a different way. How
about adding a "proxy launcher" process that would take care of the scheduling
and launching new proxies?

* multitenant_proxy

I thought the purpose of this setting is to reduce the number of backends
needed, but could not find evidence in the code. In particular,
client_attach() always retrieves the backend from the appropriate pool, and
backend_reschedule() does so as well. Thus the role of both client and backend
should always match. What piece of information do I miss?

* typo (2 occurrences in config.sgml)

"stanalone" -> "standalone"

Design / coding

* proxy.c:backend_start() does not change the value of the "host" parameter to
the socket directory, so I assume the proxy connects to the backend via TCP
protocol. I think the unix socket should be preferred for this connection if
the platform has it, however:

* is libpq necessary for the proxy to connect to backend at all? Maybe
postgres.c:ReadCommand() can be adjusted so that the backend can communicate
with the proxy just via the plain socket.

I don't like the idea of server components communicating via libpq (do we
need anything else of the libpq connection than the socket?) as such, but
especially these includes in proxy.c look weird:

#include "../interfaces/libpq/libpq-fe.h"
#include "../interfaces/libpq/libpq-int.h"

* How does the proxy recognize connections to the walsender? I haven't tested
that, but it's obvious that these connections should not be proxied.

* ConnectionProxyState is in shared memory, so access to its fields should be

* StartConnectionProxies() is only called from PostmasterMain(), so I'm not
sure the proxies get restarted after crash. Perhaps PostmasterStateMachine()
needs to call it too after calling StartupDataBase().

* Why do you need the Channel.magic integer field? Wouldn't a boolean field
"active" be sufficient?

** In proxy_loop(), I've noticed tests (chan->magic == ACTIVE_CHANNEL_MAGIC)
tests inside the branch

else if (chan->magic == ACTIVE_CHANNEL_MAGIC)

Since neither channel_write() nor channel_read() seem to change the
value, I think those tests are not necessary.

* Comment lines are often too long.

* pgindent should be applied to the patch at some point.

I can spend more time reviewing the patch during the next CF.

Antonin Houska

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2021-04-28 10:19:36 Re: Some oversights in query_id calculation
Previous Message Aleksander Alekseev 2021-04-28 09:44:45 Re: [PATCH] We install pg_regress and isolationtester but not pg_isolation_regress