Re: Internal key management system

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, cary huang <hcary328(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: Internal key management system
Date: 2020-06-12 20:59:37
Message-ID: alpine.DEB.2.22.394.2006122012390.473586@pseudo
Lists: pgsql-hackers


Hello Masahiko-san,

> Summarizing the discussed points so far, I think that the major
> advantage points of your idea comparing to the current patch's
> architecture are:
>
> * More secure. Because it never loads KEK in postgres processes we can
> lower the likelihood of KEK leakage.

Yes.

> * More extensible. We will be able to implement more protocols to
> outsource other operations to the external place.

Yes.

> On the other hand, here are some downsides and issues:
>
> * The external place needs to manage more encryption keys than the
> current patch does.

Why? If the external place is just a separate process on the same host, it
would probably manage the very same number of keys as your patch does.

> Some cloud key management services are charged by the number of active
> keys and key operations. So the number of keys postgres requires affects
> the charges. It'd be worse if we were to have keys per table.

Possibly. Note that you do not have to use a paid cloud key management
service. However, having an interface would allow postgres to use one if
the user wishes to. That is the point of an interface that can be
implemented differently for different use cases.

> * If this approach supports only GET protocol, the user needs to
> create encryption keys with appropriate ids in advance so that
> postgres can get keys by id. If postgres TDE creates keys as needed,
> CREATE protocol would also be required.

I'm not sure. ISTM that if there is a KMS to manage keys, it could be its
responsibility to actually create a key; however, the client (pg) would
have to request it, basically say "give me a new key for this id".

This could even work with a "get" command only, if the KMS is expected to
create a new key when asked for a key which does not exist yet. ISTM that
the client could (should?) only have to create identifiers for its keys.
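
A create-on-first-get KMS makes this concrete. Here is a minimal sketch,
assuming a toy in-memory key store (class and method names are mine, purely
illustrative):

```python
import os

class ToyKMS:
    """Toy in-memory KMS: "get" creates the key on first use, so the
    client never needs a separate "create" command."""

    def __init__(self):
        self._keys = {}  # key identifier -> raw 256-bit DEK

    def get(self, key_id: str) -> bytes:
        if key_id not in self._keys:
            self._keys[key_id] = os.urandom(32)  # create on first get
        return self._keys[key_id]
```

With these semantics the client only invents identifiers; the key material
itself is owned, and created on demand, by the KMS.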

> * If we need only GET protocol, the current approach (i.e.
> cluster_passphrase_command) would be more simple. I imagine the
> interface between postgres and the extension is C function.

Yes. ISTM that can be pretty simple, something like:

A guc to define the process that implements the interface (having a
separate process means that its uid can be changed), which would
communicate on its stdin/stdout.

A guc to define how to interact with the interface (eg whether DEKs are
retrieved, or whether the interface is asked to encrypt/decrypt, or
possibly some other modes).
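
As an illustration, the two gucs might look like this (parameter names and
values are purely hypothetical):

```
key_manager_command = '/usr/local/bin/pg-kms-client'  # process to start
key_manager_mode = 'local-deks'     # or 'remote-encryption', ...
```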

A few functions:

- set_key(<local-id:int>, <key-identifier:bytea>);
# may retrieve the DEK, or only note that local id of some key.

- encode(<local-id:int>, <data:bytea>) -> <encrypted-data:bytea>
# may fail if no key is associated to local-id
# or if the service is down somehow

- decode(<local-id>, <encrypted-data>) -> <data>
# could also fail if there is some integrity check associated

> This approach is more extensible

Yep.

> but it also means extensions need to support multiple protocols, leading
> to increase complexity and development cost.

I do not understand what you mean by "multiple protocols". For me there is
one protocol, with possibly a few commands in this protocol between the
client (postgres) and the KMS. Anyway, sending "GET <key-id>" to retrieve
a DEK, for instance, does not sound "complex". Here is some pseudo code:

For get_key:

    if (mode of operation is to have DEKs locally)
        try
            send to KMS "get <key-id>"
            keys[local-id] = answer
        catch & rethrow possible errors
    elif (mode is to keep DEKs remote)
        key_id[local-id] = key-id;
    else ...

For encode, the code is basically:

    if (has_key(local-id))
        if (mode of operation is to have DEKs locally)
            return some_encode(keys[local-id], data);
        elif (mode is to keep DEKs remote)
            send to KMS "encode key_id[local-id] data"
            return answer; # or error
        else ...
    else
        throw error local-id has no associated key;

decode is more or less the same as encode.
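
To check that this dispatch hangs together, here is an executable sketch of
both modes of operation; the in-process StubKMS and the XOR "cipher" are
toys standing in for a real external KMS and real encryption, and all
names are mine:

```python
import hashlib
import os

LOCAL_DEKS, REMOTE_DEKS = "local", "remote"  # the two modes of operation

def some_encode(key: bytes, data: bytes) -> bytes:
    # Toy XOR keystream derived from the key; stands in for a real cipher.
    # XOR is its own inverse, so the same function also decodes.
    stream = hashlib.sha256(key).digest()
    while len(stream) < len(data):
        stream += hashlib.sha256(stream).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

class StubKMS:
    def __init__(self):
        self._keys = {}

    def get(self, key_id: str) -> bytes:  # handles "get <key-id>"
        return self._keys.setdefault(key_id, os.urandom(32))

    def encode(self, key_id: str, data: bytes) -> bytes:  # "encode ..."
        return some_encode(self.get(key_id), data)

class KeyClient:
    def __init__(self, kms, mode):
        self.kms, self.mode = kms, mode
        self.keys = {}      # local-id -> DEK     (local mode only)
        self.key_ids = {}   # local-id -> key-id  (remote mode only)

    def set_key(self, local_id, key_id):
        if self.mode == LOCAL_DEKS:
            self.keys[local_id] = self.kms.get(key_id)  # DEK enters memory
        else:
            self.key_ids[local_id] = key_id             # DEK stays remote

    def encode(self, local_id, data: bytes) -> bytes:
        if self.mode == LOCAL_DEKS:
            if local_id not in self.keys:
                raise KeyError("local-id has no associated key")
            return some_encode(self.keys[local_id], data)
        if local_id not in self.key_ids:
            raise KeyError("local-id has no associated key")
        return self.kms.encode(self.key_ids[local_id], data)

    decode = encode  # the toy cipher is symmetric; a real one would differ
```

Note that in remote mode the client never holds any key material, which is
the whole point of that mode.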

Another thing to consider is how the client "proves" its identity to the
KMS interface, which might suggest some provisions when starting the
process, but you already have things in your patch to deal with the KEK,
which could be turned into some generic auth.

> * This approach necessarily doesn’t eliminate the data leakage threat
> completely caused by process compromisation.

Sure, if the process has decrypted data or a DEK or whatever, then
compromising the process leaks these data. My point is to try to limit the
leakage potential of a process compromise.

> Since DEK is placed in postgres process memory,

May be placed, depending on the mode of operation.

> it’s still possible that if a postgres process is compromised the
> attacker can steal database data.

Obviously. This cannot be helped if pg is to hold unencrypted data.

> The benefit of lowering the possibility of KEK leakage is to deal with
> the threat that the attacker sees database data encrypted by past or
> future DEK protected by the stolen KEK.

Yes.

> * An open question is, as you previously mentioned, how to verify the
> key obtained from the external place is the right key.

It would succeed in decrypting data if there is some associated integrity
check.

Note that from a cryptographic point of view, depending on the use case,
it may be a desirable property that you cannot tell whether it is the
right one.
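
For illustration, an encrypt-then-MAC construction gives exactly such an
integrity check: decrypting with the wrong key fails loudly instead of
returning garbage. This is a toy sketch using an HMAC tag (the keystream
"cipher" is not real cryptography; all names are mine):

```python
import hashlib
import hmac

def _keystream(key: bytes, n: int) -> bytes:
    # Toy keystream; stands in for a real cipher such as AES.
    out = hashlib.sha256(key + b"enc").digest()
    while len(out) < n:
        out += hashlib.sha256(out).digest()
    return out[:n]

def seal(key: bytes, data: bytes) -> bytes:
    ct = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    tag = hmac.new(key, ct, hashlib.sha256).digest()  # integrity tag
    return ct + tag

def unseal(key: bytes, blob: bytes) -> bytes:
    ct, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        # wrong key (or tampered data) is detected, not silently ignored
        raise ValueError("integrity check failed")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, len(ct))))
```

Of course, adding such a tag makes it possible to tell whether a key is
the right one, which trades away exactly the property mentioned above.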

> Anything else we need to note?

Dunno.

I would like to see some threat model, and what properties you would
expect depending on the hypotheses.

For instance, I guess that the minimum you would expect is that stolen
cold database data (PGDATA contents) should not allow recovering clear
contents, but only encrypted stuff, which is the whole point of encrypting
data in the first place. This is the "here and now".

ISTM that the only possible achievement of the current patch is the above.

Then you should also consider past data (prior states of PGDATA which may
have been stored somewhere the attacker might recover) and future data
(that the attacker may be able to recover later).

Now what happens on those (past, present, future) data on:

- stolen DEK

- stolen KEK

- stolen full cold data (whole disk stolen)

- access to process & live data
  (pg account compromise at some point in time)

- access to process & live data & ability to issue more commands at some
  point in time...

- access to full host live data (root compromise)

- ...

- network full compromise (eg AD has been subverted; this is the usual
  target for taking down everything on a network when all authentication
  and authorization is managed by it, which is often the case in a
  corporate network).

- the pg admin is working for the attacker...

- the sys admin is working for the attacker...

- ...

In the end you would lose anyway; the question is how soon, and how many
compromises are necessary.

> Finally, please understand I’m not controverting your idea but just
> trying to understand which approach is better from multiple
> perspectives.

The point of a discussion is basically to present arguments.

--
Fabien.
