Re: Built-in connection pooler

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Dimitri Fontaine <dim(at)tapoueh(dot)org>
Subject: Re: Built-in connection pooler
Date: 2019-07-14 22:48:13
Message-ID: CA+hUKGJPZa5pTGRLxD1DUcDHihB_5hXV3rjsrnnb4RZu1P=ukQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 15, 2019 at 7:56 AM Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> Can you please explain me more precisely how to reproduce the problem
> (how to configure postgres and how to connect to it)?

Maybe it's just that postmaster.c's ConnCreate() always allocates a
struct for port->gss if the feature is enabled, but the equivalent
code in proxy.c's proxy_loop() doesn't?
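
From memory, the allocation in ConnCreate() looks roughly like this
(paraphrased, not the exact source); presumably the proxy would need
the same before handing the Port to anything that dereferences
port->gss:

    /* Paraphrased from ConnCreate() in postmaster.c. */
    #if defined(ENABLE_GSS) || defined(ENABLE_SSPI)
        port->gss = (pg_gssinfo *) calloc(1, sizeof(pg_gssinfo));
        if (!port->gss)
        {
            ereport(LOG,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory")));
            ExitPostmaster(1);
        }
    #endif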

./configure \
--prefix=$HOME/install/postgres \
--enable-cassert \
--enable-debug \
--enable-depend \
--with-gssapi \
--with-icu \
--with-pam \
--with-ldap \
--with-openssl \
--enable-tap-tests \
--with-includes="/opt/local/include" \
--with-libraries="/opt/local/lib" \
CC="ccache cc" CFLAGS="-O0"

~/install/postgres/bin/psql postgres -h localhost -p 6543

2019-07-15 08:37:25.348 NZST [97972] LOG: server process (PID 97974)
was terminated by signal 11: Segmentation fault: 11

(lldb) bt
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x0000000104163e7e
postgres`secure_read(port=0x00007fda9ef001d0, ptr=0x00000001047ab690,
len=8192) at be-secure.c:164:6
frame #1: 0x000000010417760d postgres`pq_recvbuf at pqcomm.c:969:7
frame #2: 0x00000001041779ed postgres`pq_getbytes(s="??\x04~?",
len=1) at pqcomm.c:1110:8
frame #3: 0x0000000104284147
postgres`ProcessStartupPacket(port=0x00007fda9ef001d0,
secure_done=true) at postmaster.c:2064:6
...
(lldb) f 0
frame #0: 0x0000000104163e7e
postgres`secure_read(port=0x00007fda9ef001d0, ptr=0x00000001047ab690,
len=8192) at be-secure.c:164:6
161 else
162 #endif
163 #ifdef ENABLE_GSS
-> 164 if (port->gss->enc)
165 {
166 n = be_gssapi_read(port, ptr, len);
167 waitfor = WL_SOCKET_READABLE;
(lldb) print port->gss
(pg_gssinfo *) $0 = 0x0000000000000000

> > Obviously your concept of tainted backends (= backends that can't be
> > reused by other connections because they contain non-default session
> > state) is quite simplistic and would help only the very simplest use
> > cases. Obviously the problems that need to be solved first to do
> > better than that are quite large. Personally I think we should move
> > all GUCs into the Session struct, put the Session struct into shared
> > memory, and then figure out how to put things like prepared plans into
> > something like Ideriha-san's experimental shared memory context so
> > that they also can be accessed by any process, and then we'll mostly
> > be tackling problems that we'll have to tackle for threads too. But I
> > think you made the right choice to experiment with just reusing the
> > backends that have no state like that.
>
> That was not actually my idea: it was proposed by Dimitri Fontaine.
> In PgPRO-EE we have another version of the built-in connection pooler
> which maintains session context and allows GUCs, prepared statements
> and temporary tables to be used in pooled sessions.

Interesting. Do you serialise/deserialise plans and GUCs using
machinery similar to parallel query, and did you change temporary
tables to use shared buffers? People have suggested that we do that
to allow temporary tables in parallel queries too. FWIW I have also
wondered about per (database, user) pools for reusable parallel
workers.
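
For reference, the parallel query machinery I mean moves GUC state
between processes with this interface from guc.c (a minimal sketch;
the shared memory plumbing and error handling are omitted):

    /* Leader: estimate the size, then serialise. */
    Size    len = EstimateGUCStateSpace();
    char   *buf = palloc(len);

    SerializeGUCState(len, buf);
    /* ... hand buf to the other process, e.g. via DSM ... */

    /* Other process: apply the serialised state. */
    RestoreGUCState(buf);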

> But the main idea of this patch was to make the connection pooler less
> invasive and minimize changes in the Postgres core. This is why I have
> implemented the proxy.

How do you do it without a proxy?

One idea I've wondered about that doesn't involve feeding all network
IO through an extra process and extra layer of syscalls like this is
to figure out how to give back your PGPROC slot when idle. If you
don't have one and can't acquire one at the beginning of a
transaction, you wait until one becomes free. When idle and not in a
transaction you give it back to the pool, perhaps after a slight
delay. That may be impossible for some reason or other, I don't know.
If it could be done, it'd deal with the size-of-procarray problem
(since we like to scan it) and provide a kind of 'admission control'
(limiting number of transactions that can run), but wouldn't deal with
the amount-of-backend-memory-wasted-on-per-process-caches problem
(maybe solvable by shared caching).
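
To make the admission-control part of that concrete, here is a toy
sketch (plain C with invented names, not PostgreSQL code): a counting
semaphore standing in for the pool of PGPROC slots, taken when a
transaction begins and handed back when the session goes idle outside
a transaction (perhaps after a slight delay, in the real thing).

    #include <semaphore.h>
    #include <stdio.h>

    #define PROC_SLOTS 4                /* think max_connections */

    static sem_t proc_slots;

    static void
    begin_transaction(void)
    {
        /* Block here until another session returns a slot. */
        sem_wait(&proc_slots);
        /* ... attach to a real PGPROC and run the transaction ... */
    }

    static void
    end_transaction(void)
    {
        /* ... commit or abort, detach from the PGPROC ... */
        /* Idle and not in a transaction: give the slot back. */
        sem_post(&proc_slots);
    }

    int
    main(void)
    {
        sem_init(&proc_slots, 0, PROC_SLOTS);
        begin_transaction();
        printf("transaction admitted\n");
        end_transaction();
        return 0;
    }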

Ok, time for a little bit more testing. I was curious about the user
experience when there are no free backends.

1. I tested with connection_proxies=1, max_connections=4, and I began
3 transactions. Then I tried to make a 4th connection, and it blocked
while trying to connect and the log showed a 'sorry, too many clients
already' message. Then I committed one of my transactions and the 4th
connection was allowed to proceed.

2. I tried again, this time with 4 already existing connections and
no transactions. I began 3 concurrent transactions, and then when I
tried to begin a 4th transaction the BEGIN command blocked until one
of the other transactions committed.

Some thoughts: Why should case 1 block? Shouldn't I be allowed to
connect, even though I can't begin a transaction without waiting yet?
Why can I run only 3 transactions when I said max_connections=4? Ah,
that's probably because the proxy itself is eating one slot, and
indeed if I set connection_proxies to 2 I can now run only two
concurrent transactions. And yet when there were no transactions
running I could still open 4 connections. Hmm.

The general behaviour of waiting instead of raising an error seems
sensible, and that's how client-side connection pools usually work.
Perhaps some of the people who have wanted admission control were
thinking of doing it at the level of queries rather than whole
transactions though, I don't know. I suppose in extreme cases it's
possible to create invisible deadlocks, if a client is trying to
control more than one transaction concurrently (say, holding one
transaction open until a second transaction on another connection is
admitted, while that second transaction is waiting for the first one's
slot), but that doesn't seem like a real problem.

> > On my FreeBSD box (which doesn't have epoll(), so it's latch.c's old
> > school poll() for now), I see the connection proxy process eating a
> > lot of CPU and the temperature rising. I see with truss that it's
> > doing this as fast as it can:
> >
> > poll({ 13/POLLIN 17/POLLIN|POLLOUT },2,1000) = 1 (0x1)
> >
> > Ouch. I admit that I had the idea to test on FreeBSD because I
> > noticed the patch introduces EPOLLET and I figured this might have
> > been tested only on Linux. FWIW the same happens on a Mac.
>
> Yeah.
> This is really the problem. In addition to FreeBSD and MacOS, it also
> happens on Win32.
> I have to think more about how to solve it. We should somehow emulate
> "edge-triggered" mode for these systems.
> The source of the problem is that the proxy is reading data from one
> socket and writing it to another socket.
> If the write socket is busy, we have to wait until it is available. But
> at the same time there can be data available for input,
> so poll will return immediately unless we remove the read event for this
> socket. I am not sure how best to do that without changing the
> WaitEventSet API.

Can't you do this by removing events you're not interested in right
now, using ModifyWaitEvent() to change between WL_SOCKET_WRITEABLE and
WL_SOCKET_READABLE as appropriate? Perhaps the problem you're worried
about is that this generates extra system calls in the epoll()
implementation? I think that's not a problem for poll(), and could be
fixed for the kqueue() implementation I plan to commit eventually. I
have no clue for Windows.
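
Concretely, something like this for the socket that has a send
backlog, each time around the proxy loop (a sketch; wait_set,
backend_pos and pending_out are invented names, and the matching
treatment of the other socket's read interest is the open question):

    if (pending_out > 0)
        ModifyWaitEvent(wait_set, backend_pos, WL_SOCKET_WRITEABLE, NULL);
    else
        ModifyWaitEvent(wait_set, backend_pos, WL_SOCKET_READABLE, NULL);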

> Looks like proxy.c has to be moved somewhere outside the core?
> Maybe make it an extension? But it may not be so easy to implement
> because the proxy has to be tightly integrated with the postmaster.

I'm not sure. Seems like it should be solvable with the code in core.

--
Thomas Munro
https://enterprisedb.com
