Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-30 13:44:59
Message-ID: CAH7T-apN3Mg1iF9cEVSQfX8cz30bsytKb5amDu51B75C-yXX1w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 30, 2019 at 8:16 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Mon, Jul 29, 2019 at 8:18 PM Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
> wrote:
> >
> > On Mon, Jul 29, 2019 at 6:42 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> > wrote:
> > > > An argument could be made to push that problem upstream, i.e. let the
> > > > supplier of the passphrase deal with the indirection. You would still
> > > > need to verify the supplied passphrase/key is correct via something
> > > > like authenticating against a stored MAC.
> > >
> > > So do we need the key for MAC of passphrase/key in order to verify?
> >
> > Yes. Any 128 or 256-bit value is a valid AES key and any 16-byte input
> > can be "decrypted" with it in both CTR and CBC mode, you'll just end
> > up with garbage data if the key does not match. Verification of the
> > key prior to usage (i.e. starting DB and encrypting/decrypting data)
> > is a must as otherwise you'll end up with all kinds of corruption or
> > data loss.
> >
>
> Do you mean that we encrypt and store a 16 byte input with the correct
> key to the disk, and then decrypt it with the user supplied key and
> compare the result to the input data?
>

Yes, but we don't compare via decryption of a known input. Instead, we
compute a MAC of the encrypted master key using the user-supplied key, and
compare that against an expected MAC stored alongside the encrypted master
key.

The pseudo code would be something like:

// Read key text from user:
string raw_kek = read_from_user()
// Normalize it to a fixed size of 64 bytes
byte[64] kek = SHA512(SHA512(raw_kek))
// Split the 64 bytes into a separate encryption and MAC key
byte[32] user_encryption_key = kek.slice(0, 32)
byte[32] user_mac_key = kek.slice(32, 64)

// Read our saved MAC and encrypted master key
byte[80] mac_iv_encrypted_master_key = read_from_file()
// First 32 bytes are the MAC of the rest
byte[32] expected_mac = mac_iv_encrypted_master_key.slice(0, 32)
// Rest is a random IV + encrypted master key
byte[48] iv_encrypted_master_key = mac_iv_encrypted_master_key.slice(32, 80)

// Compute the MAC with the user supplied key
byte[32] actual_mac = HMAC(user_mac_key, iv_encrypted_master_key)
// If it does not match then the user key is invalid
if (actual_mac != expected_mac) {
    print_err_and_exit("Bad user key!")
}

// Our MAC was correct
// ... so we know user supplied key is valid
// ... and we know our iv and encrypted_key are valid
byte[16] iv = iv_encrypted_master_key.slice(0, 16)
byte[32] encrypted_master_key = iv_encrypted_master_key.slice(16, 48)
// ... so we can use all three to decrypt the master key (MDEK)
byte[32] master_key = decrypt_aes_cbc(user_encryption_key, iv,
                                      encrypted_master_key)
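
For illustration, here is a minimal Python sketch of the MAC check in the
pseudo code above, using only the standard library. It assumes HMAC-SHA256
for the 32-byte MAC (the pseudo code does not name the hash) and treats the
encrypted master key as an opaque 32 bytes, so the AES-CBC step itself is
left out. The names derive_user_keys, seal and verify_and_split are
hypothetical, and seal is an assumed write-side counterpart that the pseudo
code does not show.

```python
import hashlib
import hmac
import os
from typing import Tuple

MAC_LEN = 32  # HMAC-SHA256 output size (assumed MAC algorithm)
IV_LEN = 16   # AES block size
KEY_LEN = 32  # encrypted 256-bit master key, assumed stored without padding


def derive_user_keys(raw_kek: str) -> Tuple[bytes, bytes]:
    """Normalize the user text to 64 bytes and split it into two 32-byte keys."""
    inner = hashlib.sha512(raw_kek.encode("utf-8")).digest()
    kek = hashlib.sha512(inner).digest()
    return kek[:32], kek[32:]  # (encryption key, MAC key)


def seal(user_mac_key: bytes, iv: bytes, encrypted_master_key: bytes) -> bytes:
    """Build the 80-byte record: MAC || IV || encrypted master key."""
    body = iv + encrypted_master_key
    mac = hmac.new(user_mac_key, body, hashlib.sha256).digest()
    return mac + body


def verify_and_split(user_mac_key: bytes, record: bytes) -> Tuple[bytes, bytes]:
    """Check the MAC and return (iv, encrypted_master_key) only if it matches."""
    if len(record) != MAC_LEN + IV_LEN + KEY_LEN:
        raise ValueError("unexpected record length")
    expected_mac, body = record[:MAC_LEN], record[MAC_LEN:]
    actual_mac = hmac.new(user_mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many MAC bytes matched.
    if not hmac.compare_digest(actual_mac, expected_mac):
        raise ValueError("Bad user key!")
    return body[:IV_LEN], body[IV_LEN:]


# Usage: random bytes stand in for the real IV and AES-CBC ciphertext.
_, mac_key = derive_user_keys("correct horse battery staple")
record = seal(mac_key, os.urandom(IV_LEN), os.urandom(KEY_LEN))
iv, encrypted_master_key = verify_and_split(mac_key, record)
```

A wrong passphrase yields a different MAC key, so verify_and_split raises
before any decryption is attempted.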

> > From a single user supplied passphrase you would derive the MDEK and
> > compute a MAC (either using the same key or via a separate derived
> > MDEK-MAC key). If the computed MAC matches against the previously
> > stored value then you know the MDEK is correct as well.
>
> You meant KEK, not MDEK?
>

If the KEK is incorrect then the MAC validation would fail and the decrypt
would never be attempted.

If the MAC matches then both the KEK (user supplied key) and MDEK
("master_key" in the pseudo code above) would be confirmed to be valid. So
the MDEK is safe to use for deriving keys for encrypt / decrypt.
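
As a rough illustration of that last step, one possible way to derive
per-purpose keys from the MDEK is to use HMAC-SHA256 keyed by the MDEK over
a purpose label. This is only an assumption for the sake of example: the
derivation scheme, the function name derive_subkey and the labels are
hypothetical and not something settled in this thread.

```python
import hashlib
import hmac


def derive_subkey(master_key: bytes, label: bytes) -> bytes:
    """Derive a 32-byte key bound to `label`, using HMAC-SHA256 as a PRF."""
    if len(master_key) != 32:
        raise ValueError("expected a 32-byte master key")
    return hmac.new(master_key, label, hashlib.sha256).digest()


# Usage: an all-zero MDEK stands in for the verified master key.
mdek = bytes(32)
table_key = derive_subkey(mdek, b"table-data")  # hypothetical label
wal_key = derive_subkey(mdek, b"wal")           # hypothetical label
```

Distinct labels give independent-looking keys, so a key used for one purpose
never has to be reused for another.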

I'm using the definitions for "KEK" and "MDEK" from Joe's mail
https://www.postgresql.org/message-id/c878de71-a0c3-96b2-3e11-9ac2c35357c3%40joeconway.com

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
