Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, cary huang <hcary328(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Insung Moon <tsukiwamoon(dot)pgsql(at)gmail(dot)com>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2020-01-25 16:35:23
Message-ID: CAH7T-aoxAU5+cCZ=P4nzc-ZhGj_aqRWJWA-xrg1FT6e8ZwC65g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I took a look at this patch. With some additions I think the feature
itself is useful but the patch needs more work. It also doesn't have
any of its own automated tests yet so the testing below was done
manually.

The attached file, kms_v2.patch, is a rebased version of the
kms_v1.patch that fixes some bit rot. It sorts some of the Makefile
additions but otherwise is the original patch. This version applies
cleanly on master and passes make check.

I don't have a Windows machine to test it, but I think the Windows
build files for these changes are missing. The updated
src/common/Makefile has a comment to coordinate updates to
Mkvcbuild.pm but I don't see kmgr_utils.c or cipher_openssl.c
referenced anywhere in there.

The patch adds "pg_kmgr" to the list of files to skip in
pg_checksums.c but there's no additional "pg_kmgr" file written to the
data directory. Perhaps that's from a prior version that saved data to
its own file?

The constant AES128_KEY_LEN is defined in cipher.c but it's not used
anywhere. RE: AES-128, not sure the value of even supporting it for
this feature (v.s. just supporting AES-256). Unlike something like
table data encryption, I'd expect a KMS to be used much less
frequently so any performance boost of AES-128 vs AES-256 would be
meaningless.

The functions pg_cipher_encrypt(...), pg_cipher_decrypt(...), and
pg_compute_HMAC(...) return true if OpenSSL is not configured. Should
that be false? The ctx init functions all return false when not
configured. I don't think that code path would ever be reached as you
would not have a valid context but seems more consistent to have them
all return false.

There's a comment referring to "Encryption keys (TDEK and WDEK)
length" but this feature is only for a KMS so that should be renamed.

The passphrase is hashed to split it into two 32-byte keys but the min
length is only 8-bytes:

#define KMGR_MIN_PASSPHRASE_LEN 8

... that should be at least 64-bytes to reflect how it's being used
downstream. Depending on the format of the passphrase commands output
it should be even longer (ex: binary data in hex should really be
double that). The overall min should be 64-byte but maybe add a note
to the docs to explain how the output will be used and the expected
amount of entropy.

In pg_kmgr_wrap(...) it checks that the input is a multiple of 8 bytes:

if (datalen % 8 != 0)
ereport(ERROR,
(errmsg("input data must be multiple of 8 bytes")));

...but after testing it, the OpenSSL key wrap functions it invokes
require a multiple of 16-bytes (block size of AES). Otherwise you get
a generic error:

# SELECT pg_kmgr_wrap('abcd1234'::bytea);
ERROR: could not wrap the given secret

In ossl_compute_HMAC(...) it refers to AES256_KEY_LEN. Should be
SHA256_HMAC_KEY_LEN (they're both 32-bytes but naming is wrong)

return HMAC(EVP_sha256(), key, AES256_KEY_LEN, data,
(uint32) data_size, result, (uint32 *) result_size);

In pg_rotate_encryption_key(...) the error message for short
passphrases should be "at least %d bytes":

if (passlen < KMGR_MIN_PASSPHRASE_LEN)
ereport(ERROR,
(errmsg("passphrase must be more than %d bytes",
KMGR_MIN_PASSPHRASE_LEN)));

Rotating the passphrase via "SELECT pg_rotate_encryption_key()" and
restarting the server worked (good). Having the server attempt to
start with invalid output from the command gives an error "FATAL:
cluster passphrase does not match expected passphrase" (good).

Round tripping via wrap/unwrap works (good!):

# SELECT convert_from(pg_kmgr_unwrap(pg_kmgr_wrap('abcd1234abcd1234'::bytea)),
'utf8');
convert_from
------------------
abcd1234abcd1234
(1 row)

Trying to unwrap gibberish fails (also good!):

# SELECT pg_kmgr_unwrap('\x123456789012345678901234567890123456789012345678');
ERROR: could not unwrap the given secret

The pg_kmgr_wrap/unwrap functions use EVP_aes_256_wrap()[1] which
implements RFC 5649[2] with the default IVs so they always return the
same value for the same input:

# SELECT x, pg_kmgr_wrap('abcd1234abcd1234abcd1234') FROM
generate_series(1,5) x;
x | pg_kmgr_wrap
---+--------------------------------------------------------------------
1 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688
2 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688
3 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688
4 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688
5 | \x51041d1fe52916fd15f456c2b67108473d9bf536795e2b6d4db81c065c8cd688
(5 rows)

The IVs should be randomized so that repeated wrap operations give
distinct results. To do that, the output format needs to include the
randomized IV. It need not be secret but it needs to be included in
the wrapped output. Related, IIUC, the wrapping mechanism of RFC5649
does provide some integrity checking but it's only 64-bits (v.s. say
256-bits for a full HMAC-SHA-256).

Rather than use EVP_aes_256_wrap() with its defaults, we can generate
a random IV and have the output be "IV || ENCRYPT(KEY, IV, DATA) ||
HMAC(IV || ENCRYPT(KEY, IV, DATA))". For a fixed length internal input
(ex: the KEK encrypted key stored in pg_control) there's no need for
padding as we're dealing with multiples of 16-bytes (ex: KEK encrypted
enc-key / mac-key would be 64-bytes).

It'd also be useful if the user level wrap/unwrap API allowed for
arbitrary sized inputs (not just multiples of 16-byte). Having the
output be in a standard format (i.e. matching OpenSSL's
EVP_aes_256_wrap API) is nice, but as it's meant to be an opaque
interface I think it's fine if the output is not usable outside the
database. I don't see anyone using the wrapped data directly as it's
random bytes without the key. The primary contract for the interface:
"data == unwrap(wrap(data))". This would require enabling padding
which would round up the size of the output to the next 16-bytes.
Adding a prefix byte for a "version" would be nice too as it could be
used to infer the specific cipher/mac combo (Ex: v1 would be
AES256/HMAC-SHA256). I don't think the added size is an issue as
again, the output is opaque. Similar things can also be accomplished
by combining the 16-byte only version with pgcrypto but like this it'd
be usable out of the box without additional extensions.

[1]: https://www.openssl.org/docs/man1.1.1/man3/EVP_aes_256_wrap.html
[2]: https://tools.ietf.org/html/rfc5649

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

Attachment Content-Type Size
kms_v2.patch application/x-patch 49.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nino Floris 2020-01-25 17:43:53 Re: [PATCH] ltree, lquery, and ltxtquery binary protocol support
Previous Message 曾文旌 (义从) 2020-01-25 15:15:48 Re: [Proposal] Global temporary tables