Re: Internal key management system

From: Chris Travers <chris(dot)travers(at)adjust(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, cary huang <hcary328(at)gmail(dot)com>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Bruce Momjian <bruce(dot)momjian(at)enterprisedb(dot)com>
Subject: Re: Internal key management system
Date: 2020-02-03 02:37:01
Message-ID: CAN-RpxD747TDj=CVTeuZBn=tLgg4xkdUVm0r0N3-gQvQj38V0Q@mail.gmail.com
Lists: pgsql-hackers

Hi;

So I have actually tried to implement carefully encrypted data in Postgres
via pgcrypto. I think the key management problems in PostgreSQL are
separable from table-level encryption. In particular, the largest problem
right now with having encrypted attributes is accidental key disclosure. I
think if we solve key management in a way that works for encrypted
attributes first, we can then add encrypted tables later.

Additionally, big headaches come with key rotation. So here are my
thoughts. This is a fairly big topic, and I am not sure it can be done
incrementally, much as non-incremental work seems to doom big things in the
community, but I think it could be done with a major push by a combination
of big players, such as Second Quadrant.

On Sun, Feb 2, 2020 at 3:02 AM Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:

> Hi,
>
> I've started a new separate thread from the previous long thread[1]
> for internal key management system to PostgreSQL. As I mentioned in
> the previous mail[2], we've decided to step back and focus on only
> internal key management system for PG13. The internal key management
> system introduces the functionality to PostgreSQL that allows user to
> encrypt and decrypt data without knowing the actual key. Besides, it
> will be able to be integrated with transparent data encryption in the
> future.
>
> The basic idea is that PostgreSQL generates the master encryption key
> which is further protected by the user-provided passphrase. The key
> management system provides two functions to wrap and unwrap the secret
> by the master encryption key. A user generates a secret key locally,
> sends it to PostgreSQL to be wrapped by pg_kmgr_wrap(), and saves the
> result somewhere. Then the user can use the encrypted secret key to
> encrypt data and decrypt data by something like:
>

So my understanding is that you would then need something like:

1. Symmetric keys for actual data storage. These could never be stored in
the clear.
2. User public/private keys used to access the data storage keys. The
private key would need to be encrypted with a passphrase, and the server
needs to be able to access the private key.
3. Symmetric secret keys to encrypt private keys
4. A key management public/private key pair used to exchange the password
for the private key.

>
> INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx')));
> SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;
>

If you get anything wrong, you risk the logs being usable to recover the
encryption keys, which makes data access easy. You don't want
pg_kmgr_unwrap('xxxxx') in your logs.
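
To make the logging risk concrete, here is a minimal sketch of redacting
the wrapped key from a statement before it is written to a log. The
function name pg_kmgr_unwrap comes from the quoted proposal; the redaction
helper itself is hypothetical and is not something PostgreSQL does today.

```python
import re

# Hypothetical pattern: the single-quoted literal passed to pg_kmgr_unwrap().
# Doubled quotes ('') inside the literal are treated as part of it.
_UNWRAP_ARG = re.compile(
    r"(pg_kmgr_unwrap\s*\(\s*)'(?:[^']|'')*'",
    re.IGNORECASE,
)


def redact_statement(sql: str) -> str:
    """Return the statement with the pg_kmgr_unwrap() argument masked."""
    return _UNWRAP_ARG.sub(r"\1'<redacted>'", sql)


if __name__ == "__main__":
    stmt = "SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;"
    print(redact_statement(stmt))
```

Text matching like this is fragile (it does not cover dollar-quoted
strings, E'' literals, or bound parameters), which is part of why I would
rather keep the key out of SQL text entirely.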

Here what I would suggest is a protocol extension to do the key exchange.
In other words, protocol messages to:
1. Request data exchange server public key.
2. Send server public-key encrypted symmetric key. Make sure it is
properly padded etc.

Even these are only safe over SSL with sslmode=verify-full, since otherwise
you might be vulnerable to an MITM attack.

Then the keys should be stored in something like CacheMemoryContext and
pg_encrypt()/pg_decrypt() would have access to them along with appropriate
catalogs needed to get to the storage keys themselves.

>
> Where 'xxxxx' is the result of pg_kmgr_wrap function.
>
> That way we can get something encrypted and decrypted without ever
> knowing the actual key that was used to encrypt it.
>
> I'm currently updating the patch and will submit it.
>

The above, though, is only a small part of the problem. What we also need
is a variety of new DDL commands specifically for key management. These are
needed because without commands of this sort, we cannot make absolutely
sure that the keys are never logged. These commands MUST NOT have keys
logged and therefore must have keys stripped prior to logging. If I were
designing this:

1. Users on an SSL connection would be able to: CREATE ENCRYPTION USER
KEY PAIR WITH PASSWORD 'xyz' which would automatically rotate keys.
2. Superusers could: ALTER SYSTEM ROTATE ENCRYPTION EXCHANGE KEY PAIR;
3. Add an ENCRYPTED attribute to columns and disallow indexing of
ENCRYPTED columns. This would store keys for the columns encrypted with
user public keys where they have access.
4. Allow superusers to ALTER TABLE foo ALTER encrypted_column ROTATE KEYS;
which would naturally require a full table rewrite.

Now, what that proposal does not provide is the use of encryption to
enforce finer-grained access such as per-row keys but that's another topic
and maybe something we don't need.

However I hope that explains what I see as a version of a minimum viable
infrastructure here.

>
> On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni <sehrope(at)jackdb(dot)com> wrote:
> >
> > On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni <sehrope(at)jackdb(dot)com> wrote:
> > > > That
> > > > would allow the internal usage to have a fixed output length of
> > > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> > >
> > > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > > AES256 (master key) internally generated.
> >
> > No it should be 64-bytes. That way we can have separate 32-byte
> > encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
> >
> > While it's common to reuse the same 32-byte key for both AES256 and an
> > HMAC-SHA256 and there aren't any known issues with doing so, when
> > designing something from scratch it's more secure to use entirely
> > separate keys.
>
> The HMAC key you mentioned above is not the same as the HMAC key
> derived from the user provided passphrase, right? That is, individual
> key needs to have its IV and HMAC key. Given that the HMAC key used
> for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key (derived from
> passphrase), what will be the former key used for?
>
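
The 112-byte arithmetic and the HMAC(IV || ENCRYPT(KEY, IV, DATA))
construction quoted above can be sketched as follows. This is only a layout
sketch under assumptions: the field order IV, HMAC, DATA is taken from the
order of the sum, the helper names are hypothetical, and the ciphertext is
passed in as opaque bytes because the Python standard library has no AES
(an unpadded AES mode keeps 64 bytes of input at 64 bytes of output).

```python
import hashlib
import hmac
from typing import Tuple

IV_LEN = 16    # AES block size
MAC_LEN = 32   # HMAC-SHA256 digest size
DATA_LEN = 64  # 32-byte AES-256 key followed by 32-byte HMAC-SHA256 key


def split_key_material(data: bytes) -> Tuple[bytes, bytes]:
    """Split the 64-byte payload into (encryption key, MAC key)."""
    if len(data) != DATA_LEN:
        raise ValueError("expected 64 bytes of key material")
    return data[:32], data[32:]


def assemble(iv: bytes, ciphertext: bytes, mac_key: bytes) -> bytes:
    """Lay out IV || HMAC(IV || ciphertext) || ciphertext."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be 16 bytes")
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + tag + ciphertext


def verify(blob: bytes, mac_key: bytes) -> bool:
    """Recompute the HMAC over IV || ciphertext and compare in constant time."""
    iv = blob[:IV_LEN]
    tag = blob[IV_LEN:IV_LEN + MAC_LEN]
    ciphertext = blob[IV_LEN + MAC_LEN:]
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)
```

With a 16-byte IV and a 64-byte ciphertext, assemble() returns
16 + 32 + 64 = 112 bytes, matching the figure in the quoted mail.
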
> >
> > > > For the user facing piece, padding would be enabled to support arbitrary
> > > > input data lengths. That would make the output length grow by up to
> > > > 16-bytes (rounding the data length up to the AES block size) plus one
> > > > more byte if a version field is added.
> > >
> > > I think the length of padding also needs to be added to the output.
> > > Anyway, in the first version the same methods of wrapping/unwrapping
> > > key are used for both internal use and user facing function. And user
> > > input key needs to be a multiple of 16 bytes value.
> >
> > A separate length field does not need to be added as the
> > padding-enabled output will already include it at the end[1]. This
> > would be handled automatically by the OpenSSL encryption / decryption
> > operations (if it's enabled):
> >
>
> Yes, right.
>
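
For reference, the padding OpenSSL applies by default to block ciphers is
PKCS#7 style, where the last byte of the padded data gives the padding
length. A minimal pure-Python sketch of that scheme (the function names
are my own, not an OpenSSL or PostgreSQL API):

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block: int = BLOCK) -> bytes:
    """Append n bytes of value n, where n is 1..block."""
    n = block - (len(data) % block)
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block: int = BLOCK) -> bytes:
    """Read the padding length from the last byte and strip it."""
    if not padded or len(padded) % block != 0:
        raise ValueError("input is not a whole number of blocks")
    n = padded[-1]
    if n < 1 or n > block or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]
```

Input that is already a multiple of 16 bytes gains a full extra block, so
the output grows by between 1 and 16 bytes and no separate length field is
needed.
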
> Regards,
>
> [1]
> https://www.postgresql.org/message-id/031401d3f41d%245c70ed90%241552c8b0%24%40lab.ntt.co.jp
> [2]
> https://www.postgresql.org/message-id/CAD21AoD8QT0TWs3ma-aB821vwDKa1X519y1w3yrRKkAWjhZcrw%40mail.gmail.com
>
> --
> Masahiko Sawada http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>

--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin
