Re: Built-in connection pooler

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Dimitri Fontaine <dim(at)tapoueh(dot)org>
Subject: Re: Built-in connection pooler
Date: 2019-07-30 10:01:48
Message-ID: 22e10415-abd3-c75d-6b70-f8789c7b2db8@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30.07.2019 4:02, Tomas Vondra wrote:
>
> My idea (sorry if it wasn't too clear) was that we might handle some
> cases more gracefully.
>
> For example, if we only switch between transactions, we don't quite care
> about 'SET LOCAL' (but the current patch does set the tainted flag). The
> same thing applies to GUCs set for a function.
> For prepared statements, we might count the number of statements we
> prepared and deallocated, and treat it as 'not tained' when there are no
> statements. Maybe there's some risk I can't think of.
>
> The same thing applies to temporary tables - if you create and drop a
> temporary table, is there a reason to still treat the session as tained?
>
>

I already handling temporary tables with transaction scope (created
using "create temp table ... on commit drop" command) - backend is not
marked as tainted in this case.
Thank you for your notice about "set local" command - attached patch is
also handling such GUCs.

>> To implement prepared statements  we need to store them in session
>> context or at least add some session specific prefix to prepare
>> statement name.
>> Temporary tables also require per-session temporary table space. With
>> GUCs situation is even more complicated - actually most of the time
>> in my PgPro-EE pooler version
>> I have spent in the fight with GUCs (default values, reloading
>> configuration, memory alllocation/deallocation,...).
>> But the "show stopper" are temporary tables: if them are accessed
>> through internal (non-shared buffer), then you can not reschedule
>> session to some other backend.
>> This is why I have now patch with implementation of global temporary
>> tables (a-la Oracle) which has global metadata and are accessed
>> though shared buffers (which also allows to use them
>> in parallel queries).
>>
>
> Yeah, temporary tables are messy. Global temporary tables would be nice,
> not just because of this, but also because of catalog bloat.
>

Global temp tables solves two problems:
1. catalog bloating
2. parallel query execution.

Them are not solving problem with using temporary tables at replica.
May be this problem can be solved by implementing special table access
method for temporary tables.
But I am still no sure how useful will be such implementation of special
table access method for temporary tables.
Obviously it requires much more efforts (need to reimplement a lot of
heapam stuff).
But it will allow to eliminate MVCC overhead for temporary tuple and may
be also reduce space by reducing size of tuple header.

>
>> If Postgres backend is able to work only with on database, then you
>> will have to start at least such number of backends as number of
>> databases you have.
>> Situation with users is more obscure - it may be possible to
>> implement multiuser access to the same backend (as it can be done now
>> using "set role").
>>
>
> I don't think I've said we need anything like that. The way I'd expect
> it to work that when we run out of backend connections, we terminate
> some existing ones (and then fork new backends).

I afraid that it may eliminate most of positive effect of session
pooling if we will  terminate and launch new backends without any
attempt to bind backends to database and reuse them.

>
>> So I am not sure that if we implement sophisticated configurator
>> which allows to specify in some configuration file for each
>> database/role pair maximal/optimal number
>> of workers, then it completely eliminate the problem with multiple
>> session pools.
>>
>
> Why would we need to invent any sophisticated configurator? Why couldn't
> we use some version of what pgbouncer already does, or maybe integrate
> it somehow into pg_hba.conf?

I didn't think about such possibility.
But I suspect many problems with reusing pgbouncer code and moving it to
Postgres core.

> I also agree that more monitoring facilities are needed.
>> Just want to get better understanding what kind of information we
>> need to monitor.
>> As far as pooler is done at transaction level, all non-active session
>> are in idle state
>> and state of active sessions can be inspected using pg_stat_activity.
>>
>
> Except when sessions are tainted, for example. And when the transactions
> are long-running, it's still useful to list the connections.
>
Tainted backends are very similar with normal postgres backends.
The only difference is that them are still connected with client though
proxy.
What I wanted to say is that pg_stat_activity will show you information
about all active transactions
even in case of connection polling.  You will no get information about
pended sessions, waiting for
idle backends. But such session do not have any state (transaction is
not started yet). So there is no much useful information
we can show about them except just number of such pended sessions.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
builtin_connection_proxy-14.patch text/x-patch 137.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Wh isere 2019-07-30 10:22:50 Re: query1 followed by query2 at maximum distance vs current fixed distance
Previous Message Etsuro Fujita 2019-07-30 09:00:33 Re: [HACKERS] advanced partition matching algorithm for partition-wise join