Quick Links

Re: Feature Proposal: Connection Pool Optimization - Change the Connection User

From:	Todd Hubers <todd(dot)hubers(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, x4mmm(at)yandex-team(dot)ru
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Feature Proposal: Connection Pool Optimization - Change the Connection User
Date:	2021-11-21 02:05:16
Message-ID:	CABO3BC2XW24yLGAsBNQzEbJUx7JbSojWAUKUmKs79A3YOPa-tQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Tom, Justin, and Andrey,

Thanks everybody for your feedback so far! I agree, there are a few
unknowns for the design and impact and there are many details to iron out.

*Benchmarking* - Overall I think it's best to explore improvements with
benchmarking. The key goal of this proposal pertains to performance:
"fast". That means benchmarking is essential in the design phase, to ensure
there are measurable improvements, and reward for effort. This should also
primarily influence the selection of Solution Option (of which there are 6
or 7 listed). *I have added the proposed benchmarks to the document.
<https://docs.google.com/document/d/1u6mVKEHfKtR80UrMLNYrp5D6cCSW1_arcTaZ9HcAKlw/edit?usp=sharing>*

*1) Andrey said*: "And these expenses would be not necessary *if we could
just send a new Startup message after the Terminate ('X') message*." -
*Answer*: I have added the "Terminate/Restartup" *benchmark in the document
*accordingly.

*2) Justin said:* "However, this doesn't attempt to address the similar
issue *if there's hundreds or thousands of DBs*. Which I don't think could
work at all, since switching to a new DB requires connecting to a new
backend process."

Good point. I do have some loose thinking around that. I do have some
comments throughout the proposal document to *tackle that in the future*.
As you point out, user switching seems to be a simpler well-trodden path.
For now, my proposal is that the middleware can assert a database change,
but that should create an error for now. In the future, it might be a
supported capability.

I have also added database-switching as a consideration for the
*benchmarking*.

*3) Justin said*: "You'd need to make sure that a client connected to the
connection pooler cannot itself run the PQ "set role". *The connection
pooler would need to reject the request *(or maybe ignore requests to
switch to the same/current user)." - *Answer: Agreed*.

*4) Justin said:* "Maybe you'd have two protocol requests "begin switch
user" and "end switch to user", and then the server-side could enforce that
"begin switch" is not nested. Maybe the "begin switch" could return some
kind of "nonce" to the connection pooler, and "end switch" *would require
the same nonce* to be validated."

Agreed, something like this should be suitable. For performance, *I would
prefer to not use a Nonce* because that would not be a 0-RTT approach. The
PQ level ImpersonateDatabaseUser is effectively the "begin". I don't think
an "end" is necessary, because it is expected to switch the user very
often. The next "begin" is effectively ending the last Impersonation. The
middleware could switch back to its own username, but as per my draft
proposal, that user should only be allowed (as a connection bound
authorisation for ImpersonateDatabaseUser) to impersonate and nothing else.

And then there are yet 5 other solution options which have an explicit
Begin/End pattern.

*5) Justin said:* "It'd be interesting to hear if you've tested with
postgresql 14, which improves scalability to larger number of connections."

Agreed. I will include a baseline in the benchmarks. Regardless of
improvements, it won't be infinite and there will still be a need to
"enable" pooling where there are MAX_CONNECTIONS users using the system.

*6) Tom said:* "It's not really apparent to me how this mechanism wouldn't
be a giant security hole."

I have added your important concerns to a Security Considerations section
of the document:

- "What is the difference between granting the proposed "impersonate"
privilege and granting superuser?"
- "Is this merely transferring the entire responsibility for user
authentication onto the middleware, with the server expected to just
believe that the session should now be allowed to act as user X?"

*7) Tom said:* "I think by the time you got to something where
"impersonate" wasn't a security hole,* it would be pretty much equivalent
to SET ROLE*"

*I don't think so*. This is already contemplated in the draft. See Solution
Option 4. SET ROLE comes with RESET ROLE. The frontend client could call
RESET ROLE then change to whatever role they like. That means that feature
is not currently suitable to be used for the context of this proposal.

*8) Tom said:* "Another issue is that the proposed implementation seems
pretty far short of being an *invisible substitute for actually logging in*
as the target user"

*The goal* of this proposal is performance and *to enable shared connection
pooling*. Direct logging in doesn't allow the reuse of existing connections
in the pool.

*9) Tom said:* "For starters, what of *instantiating ALTER ROLE SET*
parameter values that apply to the target user, and getting rid of such
settings that applied to the original user ID?"

I think you are suggesting to modify the already-logged-in user. That's
interesting. However, *many systems audit the logged in user by username* -
who they are. Furthermore, the modification of user privileges would not
help to enable connection pooling as rehashed in my answer to [8].

*10) Tom said:* "How would this interact with the "on login" triggers that
people keep asking for?

That's a good point. I would imagine that SET ROLE (which is currently
unsuitable) would have the same requirement. The answer is *Shared
Functions*. SET ROLE calls a function like "*SetSessionUserId*". Our
implementation should call the same function(s). If OnLogin functionality
is implemented they should trigger from there.

*11) Tom said:* "Also, you'd still have to do DISCARD ALL (at least) when
switching users, or else integrate all that cache-flushing right into the
switching mechanism. So *I'm not entirely convinced about* how big
the *performance
benefits *are compared to a fresh session."

Agreed, nor am I convinced. We should only be guided by benchmarks and
tests, not subjective assumptions. *See the Benchmarking section* in the
document for details.

*12 Tom said:* "...[Upon failure, no further commands will be processed
until ImpersonateDatabaseUser succeeds.] *seems to require adding a huge
amount of complication on the server side*, and complication in the
protocol spec itself, to save a rather minimal amount of complication in
the middleware. Why can't we just say that a failed "impersonate" command
leaves the session in the same state as before, and it's up to the pooler
to do something about it? We are in any case trusting the pooler not to
send commands from user A to a session logged in as user B. We are in any
case trusting the pooler not to send commands from user A to a session
logged in as user B."

*I think you are overstating the complexity*. It only requires a
LastUserSwitchFailed boolean which is cleared to false when a UserSwitch
succeeds. If LastUserSwitchFailed is true, tcop ignores the messages and
sends back errors. This detail has been added to the proposal document.

*It's important that the implementation is objectively faster*. The 0-RTT
design is proposed for efficiency. The Middleware might be able to fit BOTH
the UserSwitch AND the Query within a 1500 MTU. If not, it shouldn't wait
for a confirmation - for efficiency. The middleware might be on localhost,
or it might be 1-5ms away on the LAN. Effectively, the UserSwitch is a sort
of "Header" before a series of commands. Performance is the goal.
Therefore, the connection cannot be left in the same state as before, or
else the pending Query will run in the incorrect context. This is a rare
failure mode, so failure is ideal.

Ultimately these are assumptions and *benchmarking results should drive
decisions* around the implementation of every aspect.

*13. Tom said:* "I wonder how we test such a feature meaningfully *without
incorporating a pooler right into the Postgres tree*."

*We can benchmark without a pooler* - see the Benchmark section for details
*.* (Furthermore, I propose that general benchmark tooling does belong in
Postgres for the benefit of the ecosystem of connection poolers. I have
included such a remark in the Benchmarking section "PostgreSQL is not
planning to incorporate Connection Pooling...".)

Thanks again everyone for the tough questions and the ideas!

Regards,

Todd

On Sun, 21 Nov 2021 at 08:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> > On Sun, Nov 21, 2021 at 03:11:03AM +1100, Todd Hubers wrote:
> >> - Google Document with Commenting turned on
> >>
> https://docs.google.com/document/d/1u6mVKEHfKtR80UrMLNYrp5D6cCSW1_arcTaZ9HcAKlw/edit?usp=sharing
> .
>
> > You proposed a PQ protocol version of SET ROLE/SET SESSION authorization.
> > You'd need to make sure that a client connected to the connection pooler
> cannot
> > itself run the PQ "set role".
>
> It's not really apparent to me how this mechanism wouldn't be a giant
> security hole. In particular, I see no meaningful difference between
> granting the proposed "impersonate" privilege and granting superuser.
> You could restrict it to not allow impersonating superusers, which'd
> make it a little better, but what of all the predefined roles we keep
> inventing that have privileges we don't want to be accessible to Joe
> User? I think by the time you got to something where "impersonate"
> wasn't a security hole, it would be pretty much equivalent to SET ROLE,
> ie you could only impersonate users you'd been specifically given the
> privlege for.
>
> It also seems like this is transferring the entire responsibility for
> user authentication onto the middleware, with the server expected to
> just believe that the session should now be allowed to act as user X.
> Again, that seems more like a security problem than a good design.
>
> Another issue is that the proposed implementation seems pretty far
> short of being an invisible substitute for actually logging in as
> the target user. For starters, what of instantiating ALTER ROLE SET
> parameter values that apply to the target user, and getting rid of
> such settings that applied to the original user ID? How would this
> interact with the "on login" triggers that people keep asking for?
>
> Also, you'd still have to do DISCARD ALL (at least) when switching
> users, or else integrate all that cache-flushing right into the
> switching mechanism. So I'm not entirely convinced about how big
> the performance benefits are compared to a fresh session.
>
> One more point is that the proposed business about
>
> * ImpersonateDatabaseUser will either succeed silently (0-RTT), or
> fail. Upon failure, no further commands will be processed until
> ImpersonateDatabaseUser succeeds.
>
> seems to require adding a huge amount of complication on the server side,
> and complication in the protocol spec itself, to save a rather minimal
> amount of complication in the middleware. Why can't we just say that
> a failed "impersonate" command leaves the session in the same state
> as before, and it's up to the pooler to do something about it? We are
> in any case trusting the pooler not to send commands from user A to
> a session logged in as user B.
>
> regards, tom lane
>
> PS: I wonder how we test such a feature meaningfully without
> incorporating a pooler right into the Postgres tree. I don't
> want to do that, for sure.
>

--
--
Todd Hubers

In response to

Re: Feature Proposal: Connection Pool Optimization - Change the Connection User at 2021-11-20 21:16:45 from Tom Lane

Responses

Re: Feature Proposal: Connection Pool Optimization - Change the Connection User at 2021-11-22 10:22:05 from Daniel Gustafsson

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bharath Rupireddy	2021-11-21 03:15:52	Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir
Previous Message	Paul A Jungwirth	2021-11-21 01:51:16	Re: SQL:2011 application time