Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-15 19:50:16
Message-ID: 20190715195016.iz4cchkquub6eocn@momjian.us
Lists: pgsql-hackers

On Sun, Jul 14, 2019 at 12:13:45PM -0400, Joe Conway wrote:
> On 7/13/19 5:58 PM, Tomas Vondra wrote:
> In my email I linked the wrong page for [2]. The correct one is here:
> [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html
>
> Following that, I think we could end up with three tiers:
>
> 1. A master key encryption key (KEK): this is the key supplied by the
> database admin using something akin to ssl_passphrase_command
>
> 2. A master data encryption key (MDEK): this is a generated key using a
> cryptographically secure pseudo-random number generator. It is
> encrypted using the KEK, probably with Key Wrap (KW):
> or maybe better Key Wrap with Padding (KWP):
>
> 3a. Per table data encryption keys (TDEK): use MDEK and HKDF to generate
> table specific keys.

Uh, when were per-table encryption keys discussed? Would they use
pg_class.oid or relfilenode?

> 3b. WAL data encryption keys (WDEK): Similarly use MDEK and a HKDF to
> generate new keys when needed for WAL (based on the other info we
> need to change WAL keys every 68 GB unless I read that wrong).

I thought we were going to use the WAL segment number for each 16MB
file so each 16MB gets a new nonce.
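
To make the per-segment nonce idea concrete, here is a minimal sketch in
Python. The nonce layout (8-byte segment number followed by an 8-byte block
counter) and the function names are assumptions for illustration, not
something settled in this thread. The arithmetic at the end shows one
possible reading of the 68 GB figure, namely 2^32 AES blocks of 16 bytes;
that reading is also an assumption.

```python
import struct

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16 MB WAL segment file
AES_BLOCK_SIZE = 16                  # bytes per AES block


def wal_segment_nonce(segment_number: int) -> bytes:
    """Hypothetical 16-byte nonce: segment number, then a zeroed block counter."""
    return struct.pack(">QQ", segment_number, 0)


def wal_counter_block(segment_number: int, offset: int) -> bytes:
    """Counter block for the AES block that holds byte `offset` of the segment."""
    if not 0 <= offset < WAL_SEGMENT_SIZE:
        raise ValueError("offset is outside the WAL segment")
    return struct.pack(">QQ", segment_number, offset // AES_BLOCK_SIZE)


# Each 16 MB segment holds 2**20 AES blocks, so the block counter never
# comes close to overflowing its 8 bytes within one segment.
blocks_per_segment = WAL_SEGMENT_SIZE // AES_BLOCK_SIZE

# Assumption: the 68 GB figure is 2**32 AES blocks of 16 bytes each.
bytes_per_2_32_blocks = 2**32 * AES_BLOCK_SIZE
segments_per_limit = bytes_per_2_32_blocks // WAL_SEGMENT_SIZE
```

Under that reading, 2^32 blocks is 68,719,476,736 bytes, or 4096 segments of
16 MB, so a fresh nonce per segment stays far below the limit for any single
nonce.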

> I believe that would allow us to have multiple keys but they are
> derived securely from the one DEK using available info similar to the
> way we intend to use LSN to derive the IVs -- perhaps table.oid for
> tables and something else for WAL.

Ah, got it. We might want to use relfilenode (and have pg_upgrade
preserve it) to avoid having to do catalog lookups during WAL recovery.
However, I thought we were still unclear if that 68GB is per secret or
per nonce/secret.
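
As a rough illustration of deriving per-table and WAL keys from the MDEK,
the sketch below implements HKDF (RFC 5869) with HMAC-SHA256 using only the
Python standard library. The info-string layout ("TDEK" plus relfilenode,
"WDEK" plus a key generation counter), the function names, and the 32-byte
key length are assumptions made up for this example, not a proposed format.

```python
import hashlib
import hmac
import struct

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: turn input keying material into a pseudorandom key."""
    if not salt:
        salt = b"\x00" * HASH_LEN
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch the pseudorandom key into `length` output bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key material is too long")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_table_key(mdek: bytes, relfilenode: int, length: int = 32) -> bytes:
    """Hypothetical TDEK: bound to the table through its relfilenode."""
    info = b"TDEK" + struct.pack(">I", relfilenode)
    return hkdf_expand(hkdf_extract(b"", mdek), info, length)


def derive_wal_key(mdek: bytes, key_generation: int, length: int = 32) -> bytes:
    """Hypothetical WDEK: bound to a WAL key generation counter."""
    info = b"WDEK" + struct.pack(">Q", key_generation)
    return hkdf_expand(hkdf_extract(b"", mdek), info, length)
```

Because the only per-table input here is relfilenode, recovery could
re-derive a table key from the MDEK without a catalog lookup, which is the
point of preferring relfilenode over pg_class.oid.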

> > One extra thing we should consider is authenticated encryption. We can't
> > just encrypt the pages (no matter which AES mode is used - XTS/CBC/...),
> > as that does not provide integrity protection (i.e. can't detect when
> > the ciphertext was corrupted due to disk failure or intentionally). And
> > we can't quite rely on checksums, because that checksums the plaintext
> > and is stored encrypted.
>
> I agree that authenticated encryption would be a good goal. I'm not sure
> we need to require it for the first version, although it would mean
> another option for the encryption type. That may be another good reason
> to allow both AES 128 and AES 256 CTR/CBC in the first version, as it
> will hopefully ensure that when we add different modes later it will be
> less painful.
>
> We could check the CRC prior to encryption and throw an ERROR if it is
> not correct. After decryption we can check it again -- if it no longer
> matches we would know there was a corruption or change of the
> ciphertext, no?

Yes, that is my hope too.
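
The check-before-encrypt, check-after-decrypt flow can be sketched as
follows. This is only an illustration under loud assumptions: zlib.crc32
stands in for the page checksum, and a SHA-256 based keystream stands in for
AES-CTR so that the example needs nothing beyond the standard library. The
stand-in cipher is not suitable for real use, and all names are
hypothetical. The sketch only exercises accidental corruption of the
ciphertext.

```python
import hashlib
import zlib

CRC_LEN = 4  # bytes used for the stand-in checksum


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for AES-CTR: SHA-256 over key, nonce and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = toy_keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def with_checksum(payload: bytes) -> bytes:
    """Build a toy page: payload followed by its CRC."""
    return payload + zlib.crc32(payload).to_bytes(CRC_LEN, "big")


def checksum_ok(page: bytes) -> bool:
    payload, stored = page[:-CRC_LEN], page[-CRC_LEN:]
    return zlib.crc32(payload).to_bytes(CRC_LEN, "big") == stored


def encrypt_page(key: bytes, nonce: bytes, page: bytes) -> bytes:
    """Verify the checksum first, then encrypt payload and checksum together."""
    if not checksum_ok(page):
        raise ValueError("page checksum mismatch before encryption")
    return xor_crypt(key, nonce, page)


def decrypt_page(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Decrypt, then verify the checksum that was stored encrypted."""
    page = xor_crypt(key, nonce, ciphertext)
    if not checksum_ok(page):
        raise ValueError("page checksum mismatch after decryption")
    return page
```

Flipping a single bit of the ciphertext changes the decrypted page, so the
second check raises an error instead of returning damaged data.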

> Hmm, I guess the entire page of ciphertext could be faked including CRC,
> so this would only really cover corruption, not an intentional change if
> it were done properly.

Uh, how would they get a CRC to decrypt to match their page contents?

> > Which seems pretty annoying, because then the checksums won't verify
> > data as sent to the storage system, and verify checksums would require
> > access to all keys (how do you do that in offline mode?).
>
> Given the scheme above I don't see why that would be an issue. The keys
> are all accessible via the MDEK, which is in turn available via the KEK.

Yep.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
