Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-26 00:07:28
Message-ID: 20190726000727.GT29202@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Thu, Jul 25, 2019 at 07:41:14PM -0400, Stephen Frost wrote:
> > > You are right that we can allow it online, but we haven't been
> > > discussing these cases since it is easy to do this because we have
> > > access to the keys. I do think whatever code we use for checksum online
> > > changes will be used for encryption online changes. We would need a
> > > per-page bit to indicate encryption, hopefully in the first 16 bytes.
> >
> > Arranging to have an individual table move from being plain to
> > encrypted is something that would be nice to support in an online and
> > non-blocking manner, but *that* is a bunch of additional work that we
> > don't need to do today as opposed to being something that's just part of
> > the initial design. Sure, it might use functions/capabilities that
> > pg_checksums also use, but I don't know that we need to think about the
> > code sharing there being much more than that, just that those
> > capabilities should be built out in a way that they can be used for
> > multiple things (and based on what I saw, that looks like it's exactly
> > how that code was being written already).
>
> Yes, we need to see how we are going to do that for checksums and
> encryption and come up with a plan.

This is already being done, though. Andres has already posted a patch,
and my recollection from a quick look at it is that it should work just
fine for enabling checksums as well as for potentially moving a table to
be encrypted online. The main common bit is that we need a way to say
"OK, everything has been done, but we need to flip this flag and make
sure that everyone knows that this is now all checksum'd or all
encrypted".

The only thing there that I'm not sure about is that, when it comes to
checksums, I believe the idea is that it's cluster-wide, while with
encryption that would only be true if we were trying to do something
like move the entire cluster from unencrypted to encrypted in an online
fashion (including WAL, CLOG, et al...). If that's the case, then
there's a bunch of other complicated bits, I believe, that we'd have to
work out, and I don't really think it's necessary or sensible to worry
about that right now. Those are problems we don't currently have with
checksums either: the WAL already has checksums, and I don't think
anyone's trying to address the fact that other rather core pieces of
the system currently don't.
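
As an illustration of the "flip this flag" step, here is a rough Python
sketch. It is not the code from the posted patch; the three states, the
OnlineConversion class and the convert callback are all hypothetical
names chosen for the example. The only point it makes is that the flag
moves to "on" after every page has been processed, and that nothing
relies on the new format before then.

```python
from enum import Enum


class State(Enum):
    OFF = "off"                  # nothing converted, nothing relied upon
    IN_PROGRESS = "in-progress"  # new writes are converted, reads don't rely on it
    ON = "on"                    # every page converted, reads may rely on it


class OnlineConversion:
    """Hypothetical model of an online switch (checksums or encryption)."""

    def __init__(self, pages, convert):
        self.pages = dict(pages)   # page number -> page contents
        self.convert = convert     # hypothetical per-page transform
        self.converted = set()     # page numbers already in the new format
        self.state = State.OFF

    def start(self):
        self.state = State.IN_PROGRESS

    def write_page(self, pageno, data):
        # Once conversion has started, every write stores the new format,
        # so pages written after start() need no later rewrite.
        if self.state is State.OFF:
            self.pages[pageno] = data
        else:
            self.pages[pageno] = self.convert(data)
            self.converted.add(pageno)

    def background_pass(self):
        # Rewrite whatever has not been touched since start().
        for pageno in list(self.pages):
            if pageno not in self.converted:
                self.pages[pageno] = self.convert(self.pages[pageno])
                self.converted.add(pageno)

    def finish(self):
        # The "flip this flag" step: only legal once nothing is left over.
        if self.state is not State.IN_PROGRESS:
            raise RuntimeError("conversion was not started")
        if self.converted != set(self.pages):
            raise RuntimeError("unconverted pages remain")
        self.state = State.ON
```

In this model the same machinery serves both cases; only the convert
callback (add a checksum, or encrypt the page) and the scope of the
flag (cluster-wide or per table) would differ.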

> > > > There seems to be a strong thrust on this thread to assume that a
> > > > database MUST go from ALL DECRYPTED to ALL ENCRYPTED in one shot (and
> > > > therefore we have to shut down the server to do it), but I don't get why
> > > > that's the case, particularly if we support any kind of mixed setup
> > > > where there's some data that's encrypted and some that isn't, and since
> > > > we're talking about using different keys for different things, it seems
> > > > to me that we almost certainly should be able to easily say "well,
> > > > there's no key for this, so just don't go through the decrypt/encrypt
> > > > routines".
> > >
> > > No, we can't easily do different keys for different things since all the
> > > keys have to be online for crash recovery, so there isn't much value to
> > > having different keys since they always have to be online.
> >
> > Wasn't this already discussed? We should have a master key and then
> > additional keys for different tables, et al, which are encrypted with
> > the master key. Joe, I believe, covered all this quite well.
>
> Yes, I am disagreeing with that. I posted a 5-option email that went
> over the issue and explored the value in different keys. I am still
> unclear of the benefit since it doesn't fix the 68GB limit unless we do
> it per 1GB file, and we don't even know if that limit is per key or per
> key/IV combo. We can't move ahead until we decide that.

I understand the 68GB limit that you're referring to to be per key/IV
combo, which means that a key per relation should be just fine.
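
The arithmetic behind that can be sketched as follows. This assumes the
68GB figure is the AES-GCM limit on plaintext per invocation (2^39 - 256
bits under a single key/IV pair, per NIST SP 800-38D), and it assumes a
hypothetical scheme in which each 8kB page, or at worst each 1GB segment,
gets its own IV. Neither assumption has been settled on this thread.

```python
# Assumption: "68GB" is the AES-GCM per-invocation plaintext limit
# (2**39 - 256 bits for a single key/IV pair, NIST SP 800-38D).
GCM_MAX_PLAINTEXT_BITS = 2**39 - 256
PAGE_BYTES = 8192          # default PostgreSQL block size
SEGMENT_BYTES = 2**30      # 1GB relation segment file


def gcm_max_plaintext_bytes() -> int:
    """Largest plaintext, in bytes, that one key/IV pair may encrypt."""
    return GCM_MAX_PLAINTEXT_BITS // 8


def fits_under_limit(bytes_per_iv: int) -> bool:
    """True if the data encrypted under a single key/IV pair stays in bounds."""
    return bytes_per_iv <= gcm_max_plaintext_bytes()


if __name__ == "__main__":
    limit = gcm_max_plaintext_bytes()
    print(f"limit per key/IV pair: {limit} bytes (~{limit / 1e9:.1f} GB)")
    print("one 8kB page per IV fits:", fits_under_limit(PAGE_BYTES))
    print("one 1GB segment per IV fits:", fits_under_limit(SEGMENT_BYTES))
```

If the limit really is per key/IV pair, what matters is how much data
shares one IV, not how large the relation encrypted under the key is.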

Even if it were per key, and that meant having a key per 1GB file,
that wouldn't change the point that I was making, so I'm not entirely
sure why it's being mentioned in this context.

I disagree with any approach that lacks a master key with additional
sub-keys, if that helps clarify things.
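
A rough sketch of that shape, for illustration only. It uses just the
Python standard library, so wrap_key/unwrap_key are a toy HMAC-based
stand-in for a real key-wrap algorithm such as AES Key Wrap (RFC 3394),
and KeyRing, add_relation_key and key_for are hypothetical names, not
anything proposed on this thread.

```python
import hashlib
import hmac
import os

KEY_BYTES = 32


def _pad(master_key: bytes, nonce: bytes) -> bytes:
    # Toy stand-in for a real key-wrap algorithm: a one-time pad derived
    # from the master key and a fresh nonce. Not for production use.
    return hmac.new(master_key, b"wrap" + nonce, hashlib.sha256).digest()


def wrap_key(master_key: bytes, sub_key: bytes) -> bytes:
    """Return nonce || masked sub-key || integrity tag."""
    nonce = os.urandom(16)
    masked = bytes(a ^ b for a, b in zip(sub_key, _pad(master_key, nonce)))
    tag = hmac.new(master_key, b"tag" + nonce + masked, hashlib.sha256).digest()
    return nonce + masked + tag


def unwrap_key(master_key: bytes, wrapped: bytes) -> bytes:
    """Recover the sub-key, or raise ValueError if the tag does not verify."""
    nonce, masked, tag = wrapped[:16], wrapped[16:48], wrapped[48:]
    expected = hmac.new(master_key, b"tag" + nonce + masked, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong master key or corrupted wrapped key")
    return bytes(a ^ b for a, b in zip(masked, _pad(master_key, nonce)))


class KeyRing:
    """Hypothetical catalog mapping relation names to wrapped sub-keys."""

    def __init__(self, master_key: bytes):
        self._master_key = master_key
        self._wrapped = {}  # relation name -> wrapped sub-key

    def add_relation_key(self, relname: str) -> None:
        # Generate a random sub-key and store only its wrapped form.
        self._wrapped[relname] = wrap_key(self._master_key, os.urandom(KEY_BYTES))

    def key_for(self, relname: str):
        # No entry means the relation is not encrypted, so the caller
        # skips the encrypt/decrypt routines entirely.
        wrapped = self._wrapped.get(relname)
        return None if wrapped is None else unwrap_key(self._master_key, wrapped)
```

Only the master key has to be supplied from outside; the sub-keys are
stored in wrapped form, and a relation without a sub-key is simply
treated as unencrypted.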

Thanks,

Stephen
