Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security
Date: 2023-03-24 21:47:40
Message-ID: 29c017e5-a84d-3ff7-c049-1cc0c7ccbb4d@timescale.com
Lists: pgsql-hackers

On 3/20/23 09:32, Robert Haas wrote:
> I think this is the root of our disagreement.

Agreed.

> My understanding of the
> previous discussion is that people think that the major problem here
> is the wraparound-to-superuser attack. That is, in general, when we
> connect to a database over the network, we expect it to
> do some kind of active authentication, like asking us for a password,
> or asking us for an SSL certificate that isn't just lying around for
> anyone to use. However, in the specific case of a local connection, we
> have a reliable way of knowing who the remote user is without any kind
> of active authentication, namely 'peer' authentication or perhaps even
> 'trust' if we trust all the local users, and so we don't judge it
> unreasonable to allow local connections without any form of active
> authentication. There can be some scenarios where even over a network
> we can know the identity of the person connecting with complete
> certainty, e.g. if endpoints are locked down such that the source IP
> address is a reliable indicator of who is initiating the connection,
> but in general when there's a network involved you don't know who the
> person making the connection is and need to do something extra to
> figure it out.

Okay, but this is walking back from the network example you just
described upthread. Do you still consider that in scope, or...?

> If you accept this characterization of the problem,

I'm not going to say yes or no just yet, because I don't understand your
rationale for where to draw the lines.

If you just want the bare minimum thing that will solve the localhost
case, require_auth landed this week. Login triggers are not yet a thing,
so `require_auth=password,md5,scram-sha-256` ensures active
authentication. You don't even have to disallow localhost connections,
as far as I can tell; they'll work as intended.
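For illustration, a connection string along these lines (host and user
names here are made up; this assumes a PG16 libpq with the new
require_auth parameter) would pin the subscriber to challenge-based
methods:

```
host=publisher.example.com dbname=app user=subscriber
require_auth=password,md5,scram-sha-256
```

With that in place, libpq aborts the connection if the server tries to
let it in on ambient grounds alone (e.g. via trust or peer), which is
exactly the active-authentication guarantee described above.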

If you think login triggers will get in for PG16, my bigger proposal
can't help in time. But if you're drawing the line at "environmental
HBAs are fundamentally unsafe and you shouldn't use them if you have a
proxy," why can't I instead draw the line at "login triggers are
fundamentally unsafe and you shouldn't use them if you have a proxy"?

And if you want to handle the across-the-network case, too, then I don't
accept the characterization of the problem.

> then I don't think
> the oracle is that hard to design. We simply set it up not to allow
> wraparound connections, or maybe even more narrowly to not allow
> wraparound connections to superuser. If the DBA has some weird network
> topology where that's not the correct rule, either because they want
> to allow wraparound connections or they want to disallow other things,
> then yeah they have to tell us what to allow, but I don't really see
> why that's an unreasonable expectation.

This seems like a security model that has been carefully gerrymandered
around the existing implementation. My argument is that the "weird
network topology" isn't weird at all, and it's only dangerous because of
decisions we made (and can unmake).

I feel pretty strongly that the design arrow needs to be pointed in the
opposite direction. The model needs to be chosen first, to prevent us
from saying, "We defend against whatever the implementation lets us
defend against today. Good luck, DBAs."

> If machines B and C aren't under our control such that we can
> configure them that way, then the configuration is fundamentally
> insecure in a way that we can't really fix.

Here's probably our biggest point of contention. You're unlikely to
convince me that this is the DBA's fault.

If machines B and C aren't under our control, then our *protocol* is
fundamentally insecure in a way that we have the ability to fix, in a
way that's already been characterized in security literature.

> I think that what you're proposing is that B and C can just be allowed
> to proxy to A and A can say "hey, by the way, I'm just gonna let you
> in without asking for anything else" and B and C can, when proxying,
> react to that by disconnecting before the connection actually goes
> through. That's simpler, in a sense. It doesn't require us to set up
> the proxy configuration on B and C in a way that matches what
> pg_hba.conf allows on A. Instead, B and C can automatically deduce
> what connections they should refuse to proxy.

Right. It's meant to take the "localhost/wraparound connection" out of a
class of special things we have to worry about, and make it completely
boring.
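To make the mechanism concrete, here's a minimal sketch of the
proxy-side decision, keyed off the first AuthenticationRequest code the
server sends (the 'R' message in the wire protocol). This is a
hypothetical illustration of the idea, not actual postgres_fdw/dblink
code, and the helper name is made up:

```python
# Auth codes from the PostgreSQL frontend/backend protocol:
#   0  = AuthenticationOk (no challenge at all -- e.g. trust/peer)
#   3  = AuthenticationCleartextPassword
#   5  = AuthenticationMD5Password
#   10 = AuthenticationSASL (SCRAM)
ACTIVE_AUTH_CODES = {3, 5, 10}

def proxy_should_continue(first_auth_code: int) -> bool:
    """Return True only if the server demanded active authentication.

    A bare AuthenticationOk (code 0) as the *first* auth message means
    the server admitted us on environmental grounds alone, so a cautious
    proxy disconnects before the handshake completes.
    """
    return first_auth_code in ACTIVE_AUTH_CODES
```

The point of the sketch is that the proxy needs no knowledge of the
remote pg_hba.conf: the server's own first message tells it whether a
wraparound connection is being waved through passively.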

> I guess that's nice, but
> it feels pretty magical to me. It encourages the DBA not to think
> about what B and C should actually be allowed to proxy, and instead
> just trust that the automatics are going to prevent any security
> disasters.

I agree magical behavior is dangerous, if what you think it can do
doesn't match up with what it can actually do. Bugs are always possible,
and maybe I'm just not seeing a corner case yet, because I'm talking too
much and not coding it -- but is this really a case where I'm
overpromising? Or does it just feel magical because it's meant to fix
the root issue?

(Remember, I'm not arguing against your proxy filter; I just want both.
They complement each other.)

> I'm not sure that they always will, and I fear cultivating
> too much reliance on them.

I can't really argue against this... but I'm not really sure anyone could.

My strawman rephrasing of that is, "we have to make the feature crappy
enough that we can blame the DBA when things go wrong." And even that
strawman could be perfectly reasonable, in situations where the DBA
necessarily has more information than the machine. In this case, though,
it seems to me that the two machines have all the information necessary
to make a correct decision between them.

Thanks!
--Jacob
