Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-09 01:25:22
Message-ID: 20190809012522.nb4g6t7dj2svxxzz@momjian.us
Lists: pgsql-hackers

On Thu, Aug 8, 2019 at 06:31:42PM -0400, Stephen Frost wrote:
> > >Crash recovery doesn't happen "all the time" and neither does vacuum
> > >freeze, and autovacuum processes are independent of individual client
> > >backends- we don't need to (and shouldn't) have the keys in shared
> > >memory.
> >
> > Don't people do physical replication / HA pretty much all the time?
>
> Strictly speaking, that isn't actually crash recovery, it's physical
> replication / HA, and while those are certainly nice to have it's no
> guarantee that they're required or that you'd want to have the same keys
> for them- conceptually, at least, you could have WAL with one key that
> both sides know and then different keys for the actual data files, if we
> go with the approach where the WAL is encrypted with one key and then
> otherwise is plaintext.

Uh, yes, you could have two encryption keys in the data directory, one
for heap/indexes, one for WAL, both unlocked with the same passphrase,
but what would be the value in that?
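
As a minimal sketch of that arrangement, assuming a passphrase-derived
key-encryption key (KEK) that unwraps two separately stored data keys:
the names derive_kek, wrap, and unwrap are hypothetical, and the XOR pad
is only a toy stand-in for a real key-wrap algorithm such as AES Key
Wrap. Nothing here reflects an actual PostgreSQL design.

```python
import hashlib
import hmac
import os


def derive_kek(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key-encryption key from the passphrase (PBKDF2)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)


def _pad(kek: bytes, label: bytes) -> bytes:
    """Per-key 32-byte pad; the label keeps the heap and WAL pads distinct."""
    return hmac.new(kek, label, hashlib.sha256).digest()


def wrap(kek: bytes, label: bytes, key: bytes) -> bytes:
    """Toy key wrap: XOR with the pad. NOT a real key-wrap algorithm."""
    return bytes(a ^ b for a, b in zip(key, _pad(kek, label)))


unwrap = wrap  # XOR is its own inverse

# Cluster initialization: two independent random data keys.
salt = os.urandom(16)
heap_key = os.urandom(32)  # would encrypt heap/index pages
wal_key = os.urandom(32)   # would encrypt WAL

kek = derive_kek("correct horse", salt)
# Only the salt and the wrapped keys would be stored in the data directory.
stored = {
    b"heap": wrap(kek, b"heap", heap_key),
    b"wal": wrap(kek, b"wal", wal_key),
}

# Server start: the same passphrase unlocks both keys.
kek_again = derive_kek("correct horse", salt)
recovered_heap = unwrap(kek_again, b"heap", stored[b"heap"])
recovered_wal = unwrap(kek_again, b"wal", stored[b"wal"])
```

Anyone holding the passphrase recovers both keys, so the split by itself
adds no separation of access between the heap/index data and the WAL.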

> > >>That might allow crash recovery and the freeze part of VACUUM FREEZE to
> > >>work. (I don't think we could vacuum since we couldn't read the index
> > >>pages to find the matching rows since the index values would be encrypted
> > >>too. We might be able to not encrypt the tid in the index tuple.)
> > >
> > >Why do we need the indexed values to vacuum the index..? We don't
> > >today, as I recall. We would need the tids though, yes.
> >
> > Well, we also do collect statistics on the data, for example. But even
> > if we assume we wouldn't do that for encrypted indexes (which seems like
> > a pretty bad idea to me), you'd probably end up leaking information
> > about ordering of the values. Which is generally a pretty serious
> > information leak, AFAICS.
>
> I agree entirely that order information would be bad to leak- but this
> is all new ground here and we haven't actually sorted out what such a
> partially encrypted btree would look like. We don't actually have to
> have the down-links in the tree be unencrypted to allow vacuuming of
> leaf pages, after all.

Agreed, but I think we kind of know that the value in cluster-wide
encryption is different from multi-key encryption --- both have their
value, but right now cluster-wide is the easiest and simplest, and
probably meets more user needs than multi-key encryption. If others
want to start scoping out what multi-key encryption would look like, we
can discuss it. I personally would like to focus on cluster-wide
encryption for PG 13.

> > >>Is this something to consider in version one of this feature? Probably
> > >>not, but later? Never? Would the information leakage be too great,
> > >>particularly from indexes?
> > >
> > >What would be leaking from the indexes..? That an encrypted blob in the
> > >index pointed to a given tid? Wouldn't someone be able to see that same
> > >information by looking directly at the relation too?
> >
> > Ordering of values, for example. Depending on how exactly the data is
> > encrypted we might also be leaking information about which values are
> > equal, etc. It also seems quite a bit more expensive to use such an index.
>
> Using an encrypted index isn't going to be free. It's not clear that
> this would be much more expensive than if the entire index is encrypted,
> or that people would actually be unhappy if there was such an additional
> expense if it meant that they could have vacuum run without the keys.

Yes, I think it is information leakage that is always going to make
multi-key unable to fulfill all the features of cluster-wide encryption.
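
A toy illustration of that leakage, assuming a hypothetical index whose
entries stay in value order with plaintext tids while the values
themselves are opaque: the opaque() helper uses HMAC purely as a
stand-in for deterministic encryption, and the rows are made-up data.

```python
import hashlib
import hmac
from collections import defaultdict


def opaque(key: bytes, value: int) -> str:
    """Stand-in for deterministic encryption: same value, same blob."""
    return hmac.new(key, str(value).encode(), hashlib.sha256).hexdigest()[:16]


key = b"table-key-unknown-to-the-observer"

# Hypothetical heap rows: tid -> indexed column value.
rows = {(0, 1): 50000, (0, 2): 120000, (0, 3): 50000, (0, 4): 75000}

# Index entries kept in value order, values opaque, tids readable.
index = [
    (opaque(key, value), tid)
    for tid, value in sorted(rows.items(), key=lambda item: item[1])
]

# What an observer WITHOUT the key learns from the index alone:
# 1. the rank of every row by value, from entry order;
tids_in_value_order = [tid for _, tid in index]

# 2. which rows hold equal values, from repeated blobs.
groups = defaultdict(list)
for blob, tid in index:
    groups[blob].append(tid)
equal_value_groups = [tids for tids in groups.values() if len(tids) > 1]

print(tids_in_value_order)
print(equal_value_groups)
```

No value is ever decrypted, yet the observer learns both the ordering of
the rows and which rows share a value.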

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
