Re: Moving forward with TDE

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Chris Travers <chris(dot)travers(at)gmail(dot)com>, David Christensen <david+pg(at)pgguru(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Moving forward with TDE
Date: 2023-03-28 02:56:58
Message-ID: CAOuzzgqoWfw47-hBt0v066y6yB8e+axD3N6AQtuzS5FZ8Tqdkw@mail.gmail.com
Lists: pgsql-hackers

Greetings,

On Mon, Mar 27, 2023 at 21:35 Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> > The remote storage is certainly an independent system. Multi-mount LUNs are
> > entirely possible in a SAN (and absolutely with NFS, or just the NFS server
> > itself is compromised..), so while the attacker may not have any access to the
> > database server itself, they may have access to these other systems, and that’s
> > not even considering in-transit attacks which are also absolutely possible,
> > especially with iSCSI or NFS.
> >
> > I don’t understand what is being claimed that the remote storage is “not an
> > independent system” based on my understanding of, eg, NFS. With NFS, a
> > directory on the NFS server is exported and the client mounts that directory as
> > NFS locally, all over a network which may or may not be secured against
> > manipulation. A user on the NFS server with root access is absolutely able to
> > access and modify files on the NFS server trivially, even if they have no
> > access to the PG server. Would you explain what you mean?
>
> The point is that someone could change values in the storage, pg_xact,
> encryption settings, binaries, that would allow the attacker to learn
> the encryption key. This is not possible for two secure endpoints and
> someone changing data in transit. Yeah, it took me a while to
> understand these boundaries too.

This depends on the specific configuration of the systems, clearly. Being
able to change values in other parts of the system isn’t great and we
should work to improve on that, but clearly that isn’t so much of an issue
that people are unwilling to accept a partial solution; otherwise the
existing commercial solutions wouldn’t be accepted or considered viable.
Indeed, using GCM is objectively an improvement over what’s being offered
commonly today.

I also generally object to the idea that being able to manipulate the
PGDATA directory necessarily means being able to gain access to the KEK. In
trivial solutions, sure, it’s possible, but the NFS server should never be
asking some external KMS for the key to a given DB server and a reasonable
implementation won’t allow this, and instead would flag and log such an
attempt for someone to review, leading to a much faster realization of a
compromised system.

Certainly it’s much simpler to reason about an attacker with no knowledge
of either system and only network access to see if they can penetrate the
communications between the two end-points, but that is not the only case
where authenticated encryption is useful.
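
As a rough illustration of that last point (a sketch only, not taken from any
TDE patch): a tag computed under a key that only the database server holds
lets the server notice when someone with write access to the storage alters or
relocates a page. HMAC-SHA256 from the Python standard library stands in for
the GCM authentication tag here, and no encryption is performed. The names
seal_page and open_page and the page layout are hypothetical.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # size in bytes of an HMAC-SHA256 tag


def _tag(key: bytes, page_no: int, page: bytes) -> bytes:
    # Bind the tag to the page number so a page cannot be silently
    # moved to a different location on the storage.
    msg = page_no.to_bytes(8, "big") + page
    return hmac.new(key, msg, hashlib.sha256).digest()


def seal_page(key: bytes, page_no: int, page: bytes) -> bytes:
    """Return the page followed by its authentication tag."""
    return page + _tag(key, page_no, page)


def open_page(key: bytes, page_no: int, stored: bytes) -> bytes:
    """Verify the tag and return the page, or raise ValueError."""
    page, tag = stored[:-TAG_LEN], stored[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(key, page_no, page)):
        raise ValueError("page failed authentication")
    return page


if __name__ == "__main__":
    key = os.urandom(32)  # held by the database server only
    stored = seal_page(key, 7, b"tuple data")

    # Simulate a storage-side attacker flipping one bit of the page.
    tampered = bytes([stored[0] ^ 0x01]) + stored[1:]
    try:
        open_page(key, 7, tampered)
    except ValueError as exc:
        print("rejected:", exc)
```

The attacker in this sketch can write to the storage but never sees the key,
which is the NFS-server-compromise case described above rather than the
two-secure-endpoints case.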

> > > So the idea is that the backup user can be compromised without the data
> > > being vulnerable --- makes sense, though that use-case seems narrow.
> >
> > That’s perhaps a fair consideration- but it’s clearly of enough value that many
> > of our users are asking for it and not using PG because we don’t have it today.
> > Ultimately though, this clearly makes it more than a “checkbox” feature. I hope
> > we are able to agree on that now.
>
> It is more than a check box feature, yes, but I am guessing few people
> are wanting this for the actual features beyond check box.

As I explained previously, perhaps the people asking are doing so for only
the “checkbox”, but that doesn’t mean it isn’t a useful feature or that it
isn’t valuable in its own right. Those checklists were compiled and
enforced for a reason, which the end users might not understand but which is
still absolutely valuable. Sad to say, this is becoming more and more
common, but we shouldn’t be faulting the users asking for it- if it were
truly useless then eventually it would be removed from the standard, but it
hasn’t been and it won’t be because, while not every end user has the depth
of understanding to explain it, it is actually a useful and important
capability to have and one that is important to implement.

> > > Yes, there is value beyond the check-box, but in most cases those
> > > values are limited considering the complexity of the features, and the
> > > check-box is what most people are asking for, I think.
> >
> > For the users who ask on the lists for this feature, regularly, how many don’t
> > ask because they google or find prior responses on the list to the question of
> > if we have this capability? How do we know that their cases are “checkbox”?
>
> Because I have rarely heard people articulate the value beyond check
> box.

Have I done so sufficiently then that we can agree that calling it
“checkbox” is inappropriate and detrimental to our user base?

> > Consider that there are standards groups which explicitly consider these attack
> > vectors and consider them important enough to require mitigations to address
> > those vectors. Do the end users of PG understand the attack vectors or why they
> > matter? Perhaps not, but just because they can’t articulate the reasoning does
> > NOT mean that the attack vector doesn’t exist or that their environment is
> > somehow immune to it- indeed, as the standards bodies surely know, the opposite
> > is true- they’re almost certainly at risk of those attack vectors and therefore
> > the standards bodies are absolutely justified in requiring them to provide a
> > solution. Treating these users as unimportant because they don’t have the depth
> > of understanding that we do or that the standards body does is not helping
> > them- it’s actively driving them away from PG.
>
> Well, then who is going to explain them here, because I have not heard
> them yet.

I thought I was doing so.

> > > The RLS arguments were that queries could expose some of the underlying
> > > data, but in summary, that was considered acceptable.
> >
> > This is an excellent point- and dovetails very nicely into my argument that
> > protecting primary data (what is provided by users and ends up in indexes and
> > heaps) is valuable even if we don’t (yet..) have protection for other parts of
> > the system. Reducing the size of the attack vector is absolutely useful,
> > especially when it’s such a large amount of the data in the system. Yes, we
> > should, and will, continue to improve- as we do with many features, but we
> > don’t need to wait for perfection to include this feature, just as with RLS and
> > numerous other features we have.
>
> The issue is that you needed a certain type of user with a certain type
> of access to break RLS, while for this, writing to PGDATA is the simple
> case for all the breakage, and the thing we are protecting with
> authentication.

This goes back to the “if it isn’t perfect then it’s useless” argument …
but that’s exactly the discussion which was had around RLS and ultimately
we decided that RLS was still useful even with the leaks- and our users
accepted that also and have benefitted from it ever since it was included
in core. The same applies here- yes, more needs to be done than the absolute
simplest “make install” to have the system be secure (not unlike today with
our defaults from a source build with “make install”..) but at least with
this capability included it’s possible, and we can write “securing
PostgreSQL” documentation on how to do so, whereas without it there is simply
no way to address the attack vectors I’ve articulated here.

> > > > > > We, as a community, are clearly losing value by lack of this capability,
> > > > > > if by no other measure than simply the numerous users of the commercial
> > > > > > implementations feeling that they simply can’t use PG without this
> > > > > > feature, for whatever their reasoning.
> > > > >
> > > > > That is true, but I go back to my concern over useful feature vs. check
> > > > > box.
> > > >
> > > > While it’s easy to label something as checkbox, I don’t feel we have been
> > > > fair
> > >
> > > No, actually, it isn't. I am not sure why you are saying that.
> >
> > I’m confused as to what is required to label a feature as a “checkbox” feature
> > then. What did you use to make that determination of this feature? I’m happy to
> > be wrong here.
>
> I don't see the point in me continuing to reply here. You just seem to
> continue asking questions without actually thinking of what I am saying,
> and hope I get tired or something.

I hope we have others who have a moment to chime in here and provide their
viewpoints as I don’t feel this is an accurate representation of the
discussion thus far.

Thanks,

Stephen
