Re: Internal key management system

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, cary huang <hcary328(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: Internal key management system
Date: 2020-10-26 15:02:36
Message-ID: 20201026150236.GX16415@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Craig Ringer (craig(dot)ringer(at)enterprisedb(dot)com) wrote:
> On Mon, Oct 19, 2020 at 11:16 AM Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > The patch introduces only key management infrastructure but with no
> > key. Currently, there is no interface to dynamically add a new
> > encryption key.
>
> I'm a bit confused by the exact intent and use cases behind this patch.
> https://www.postgresql.org/message-id/17156d2e419.12a27f6df87825.436300492203108132%40highgo.ca
> that was somewhat helpful but not entirely clear.
>
> The main intent of this proposal seems to be to power TDE-style encryption
> of data at rest, with a single master key for the entire cluster. Has any
> consideration been given to user- or role-level key management as part of
> this, or is that expected to be done separately and protected by the master
> key supplied by this patch?

I've not been following very closely, but I definitely agree with the
general feedback here (more on that below). To this point- I do
believe that was the intent, or at least I sure hope that it was. Being
able to have user/role keys would certainly be good. Having a way for a
user to log in and unlock their key would also be really nice.

> If so, what if I have a HSM (or virtualised or paravirt or network proxied
> HSM) that I want to use to manage my database keys, such that the database
> master key is protected by the HSM? Say I want to put my database key in a
> smartcard, my machine's TPM, a usb HSM, a virtual HSM provided by my
> VM/cloud platform, etc?
>
> As far as I can tell with the current design I'd have to encrypt my unlock
> passphrase and put it in the cluster_passphrase_command script or its
> arguments. The script would ask the HSM to decrypt the key passphrase and
> write that over stdio where Pg would read it and use it to decrypt the
> master key(s). That would work - but it should not be necessary and it
> weakens the protection offered by the HSM considerably.

Yeah, I do think this is how you'd need to do it and I agree that it'd
be better to offer an option that can go to the HSM directly. That
said- I don't think we necessarily want to throw out the command-based
option, as users may wish to use a vaulting solution or similar instead
of an HSM. What I am curious about though- what are the thoughts around
using a vaulting solution's command-line tool vs. writing code to work
with an API? Between these various options, what are the risks of
having a script vs. using an API and would one or the other weaken the
overall solution? Or is what's really needed here a way to tell us
whether it's a passphrase we're getting or a proper key, regardless of
the method being used to fetch it?

> I suggest we allow the user to supply their own KEK via a
> cluster_encryption_key GUC. If set, Pg would create an SSLContext with the
> supplied key and use that SSLContext to decrypt the application keys - with
> no intermediate KEK-derivation based on cluster_passphrase_command
> performed. cluster_encryption_key could be set to an openssl engine URI, in
> which case OpenSSL would transparently use the supplied engine (usually a
> HSM) to decrypt the application keys. We'd install the
> cluster_passphrase_command as an openssl askpass callback so that if the
> HSM requires an unlock password it can be provided - like how it's done for
> libpq in Pg 13. Some thought is required for how to do key rotation here,
> though it matters a great deal less when a HSM is managing key escrow.

This really locks us into OpenSSL, which I don't particularly like. If
we do go down this route, we should definitely make it clear that this
is for use when PG has been built with OpenSSL, i.e.:
openssl_cluster_encryption_key as the parameter name, or such.

> For example if I want to lock my database with a YubiHSM I would configure
> something like:
>
> cluster_encryption_key = 'pkcs11:token=YubiHSM;id=0:0001;type=private'
>
> The DB would be encrypted and decrypted using application keys unlocked by
> the HSM. Backups of the database, stolen disk images, etc, would be
> unreadable unless you have access to another HSM with the same key loaded.

Well, you would surely just need the key. Since you could change the PG
config to fetch the key from wherever you have it, you wouldn't need an
actual HSM.

> If cluster_encryption_key is unset, Pg would perform its own KEK derivation
> based on cluster_passphrase_command as currently implemented.

To what I was suggesting above- what if we just had a GUC, say
"kek_method", with options 'passphrase' and 'direct', where 'passphrase'
goes through KEK derivation and 'direct' doesn't? That would just change
how we treat the result of calling cluster_passphrase_command.
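
To make the kek_method idea a bit more concrete, here is a rough sketch
in Python (illustration only, not PG code). Everything beyond the two
option names is an assumption: the function names fetch_secret and
resolve_kek, the use of PBKDF2 with these parameters for the
'passphrase' path, hex encoding for the 'direct' path, and the 32-byte
key length are all hypothetical.

```python
import hashlib
import subprocess

KEK_LEN = 32  # bytes; assumed key length for this sketch


def fetch_secret(command: str) -> bytes:
    """Run the configured command and return its stdout without the trailing newline.

    Stands in for whatever cluster_passphrase_command would be set to.
    """
    result = subprocess.run(command, shell=True, check=True, capture_output=True)
    return result.stdout.rstrip(b"\n")


def resolve_kek(secret: bytes, kek_method: str, salt: bytes) -> bytes:
    """Turn the command's output into a KEK according to kek_method."""
    if kek_method == "passphrase":
        # Treat the output as a passphrase and derive the KEK from it.
        # PBKDF2 and the iteration count are assumptions, not the patch's choice.
        return hashlib.pbkdf2_hmac("sha256", secret, salt, 100_000, dklen=KEK_LEN)
    if kek_method == "direct":
        # Treat the output as the key itself (hex-encoded here), no derivation.
        key = bytes.fromhex(secret.decode("ascii"))
        if len(key) != KEK_LEN:
            raise ValueError(f"direct key must be {KEK_LEN} bytes, got {len(key)}")
        return key
    raise ValueError(f"unknown kek_method: {kek_method!r}")
```

The point is only that the fetch step stays the same in both cases,
whether the command talks to a vault, an HSM, or anything else, and the
setting decides what we do with the bytes that come back.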

> I really don't think we should be adopting something that doesn't consider
> platform based hardware key escrow and protection.

I agree that we should consider platform based hardware key escrow and
protection. I'm generally supportive of trying to do so in a way that
keeps things very flexible for users without us having to write a lot of
code that's either library-specific or solution-specific.

Thanks,

Stephen
