Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-26 15:27:58
Message-ID: CAD21AoAB5+F0RAb5gHNV74CXrBYfQrvTPGx86MrEfVM=x4iPbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 26, 2019 at 10:57 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>
> Hi,
>
> Before my reply, I wanted to say that I've been lurking on this thread
> for a bit as I've tried to better inform myself on encryption at rest
> and how it will apply to what we want to build. I actually built a
> (poor) prototype in Python of the key management system that Joe &
> Masahiko both laid out, in addition to performing some "buffer
> encrpytion" with it. It's not worth sharing at this point.
>
> With the disclaimer that I'm not as familiar with a lot of concepts as I
> would like to be:
>
> On 7/25/19 1:54 PM, Masahiko Sawada wrote:
> > On Fri, Jul 26, 2019 at 2:18 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >>
> >> On Thu, Jul 18, 2019 at 12:04:25PM +0900, Masahiko Sawada wrote:
> >>> I've re-considered the design of TDE feature based on the discussion
> >>> so far. The one of the main open question is the granular of
> >>> encryption objects: cluster encryption or more-granular-than-cluster
> >>> encryption. The followings describe about the new TDE design when we
> >>> choose table-level encryption or something-new-group-level encryption.
> >>>
> >>> General
> >>> ========
> >>> We will use AES and support both AES-128 and AES-256. User can specify
> >>> the new initdb option something like --aes-128 or --aes-256 to enable
> >>> encryption and must specify --encryption-key-passphrase-command along
> >>> with. (I guess we also require openssl library.) If these options are
> >>> specified, we write the key length to the control file and derive the
> >>> KEK and generate MDEK during initdb. wal_log_hints will be enabled
> >>> automatically in encryption mode, like we do for checksum mode,
> >>
> >> Agreed. pg_control will store the none/AES128/AES256 indicator.
> >>
> >>> Key Management
> >>> ==============
> >>> We will use 3-tier key architecture as Joe proposed.
> >>>
> >>> 1. A master key encryption key (KEK): this is the ley supplied by the
> >>> database admin using something akin to ssl_passphrase_command
> >>>
> >>> 2. A master data encryption key (MDEK): this is a generated key using a
> >>> cryptographically secure pseudo-random number generator. It is
> >>> encrypted using the KEK, probably with Key Wrap (KW):
> >>> or maybe better Key Wrap with Padding (KWP):
> >>>
> >>> 3a. Per table data encryption keys (TDEK): use MDEK and HKDF to generate
> >>> table specific keys.
> >>
> >> What is the value of a per-table encryption key? How is HKDF derived?
> >
> > Per-table encryption key is derived from MDEK with salt and its OID as
> > info. I think we can store salts for each encryption keys into the
> > separate file so that off-line tool also can read it.
>
> +1 with using the info/salt for the HKDF as described above. The other
> decision will be the hashing algorithm to use. SHA-256?

Yeah, SHA-256 would be better for safety.

>
>
> >>> 3b. WAL data encryption keys (WDEK): Similarly use MDEK and a HKDF to
> >>> generate new keys when needed for WAL.
> >>>
> >>> We store MDEK to the plain file (say global/pgkey) after encrypted
> >>> with the KEK. I might want to store the hash of passphrase of the KEK
> >>> in order to verify the correctness of the given passphrase. However we
> >>> don't need to store TDEK and WDEK as we can derive them as needed. The
> >>> key file can be read by both backend processes and front-end tools.
> >>
> >> Yes, we need to verify the pass phrase.
>
> Just to clarify, this would be a hash of the KEK?

No, it's a hash of passphrase. Or we might be able to use crypt(3) to
verify the input passphrase.

Apart from passing the passphrase, there are users who rather want to
pass the key directly, for example when using external key management
services. So it might be good if we provide both way.

>
> From my experiments, the MDEK key unwrapping fails if you do not have
> the correct KEK (as it should). If it's a matter of storing a hash of
> the KEK, I'm not sure if there is much added benefit to have it, but I
> would not necessarily oppose it either.
>
> >>> When postmaster startup, it reads the key file and decrypts MDEK and
> >>> derive WDEK using key id for WDEK.
>
> I don't know if this is getting too far ahead, but what happens if the
> supplied KEK fails to decrypt the MDEK? Will postmaster refuse to startup?

I think it should refuse to startup. It would not able to operate all
things properly without correct keys and we prevent to startup from
possible malicious user.

>
> >>> WDEK is loaded to the key hash map
> >>> (keyid -> key) on the shared memory. Also we derive TDEK as needed
> >>> when reading tables or indexes and add it to the key hash map as well
> >>> if not exists.
>
> +1 to this approach.
>
> >>>
> >>> Buffer Encryption
> >>> ==============
> >>> We will use AES-CBC for buffer encryption. We will add key id (4byte)
> >>
> >> I think we might want to use CTR for this, and will post after this.
>
> Not sure if I missed this post or not (as several people mentioned, it
> is easy to get lost in this thread).
>
> I think what will help drive this decision is whether or not we consider
> the data we are storing on disk as a "file system" in itself. Trying to
> make myself literate in disk encryption theory[1], it seems a big
> weakness in using CTR mode for encryption is we need to be able to
> guarantee a fresh counter for every page we encrypt[2], so if we can
> guarantee the uniqueness of IV per TDEK, this is on the table.
>
> XTS mode, on the other hand, appears to be more durable to reusing an IV
> as the "tweak" was designed to represent a disk sector, though there are
> still problems. However, I presume this is one of many reasons why
> fscrypt uses XTS[3].
>
> For data malleability, CTR is described to be more vulnerable, but both
> modes (all for that modes?) require some sort of digital signature (and
> most of my research has lead to Encrypt-then-MAC, which I know is being
> discussed elsewhere in the thread).
>
> >>
> >>> to after the pd_lsn(8byte) in PageHeaderData and we will not encrypt
> >>> first 16 byte of each pages so the LSN and key id can be used. We can
> >>> store an invalid key id to tell us that the table is not encrypted.
> >>> There two benefits of storing key id to the page header: offline tools
> >>> can get key id (and know the table is encrypted or not) and it's
> >>> helpful for online rekey in the future.
> >>
> >> I don't remember anyone suggesting different keys for different tables.
> >> How would this even be managed by the user?
> >
> > I think it's still unclear whether we implement one key for whole
> > database cluster or different keys for different table as the first
> > version. I'm evaluating the performance overhead of the latter that
> > you concerned and will share it.
> >
> > I prefer tablespace-level or something-new-group-level than
> > table-level but if we choose the latter we can create a new group of
> > tables that are encrypted with the same key. That is user create a
> > group and then associate tables to that group. Tablespace-level is
> > implemented in the patch I submitted before.
>
> I may not be following here...but the TDEKs are can be dervied with a
> (OID,salt) combination, so even if it was per tablespace we would be
> storing a salt -- I'm not sure how it would affect being per-table other
> than the additional overhead of storing the salt per table...
>
> ...I think the pain is realized if/when there is a TDEK rotation, i.e.
> the amount of data encrypted by the (OID,salt) pair exceeds
>
> > Or it's just idea but
> > another idea could be to allow users to create encryption key object
> > first and then specify which tables are encrypted with which
> > encryption key in DDL. For example, user creates an encryption keys
> > with name by SQL function and creates an encrypted table by CREATE
> > TABLE ... WITH (encryption_key = 'mykey');.
>
> -1 for storing encryption keys in the DDL. If someone has `log_statement
> = ddl` or above, those keys will get stored in plaintext to said logs.

Sorry, I meant to create an encrypted table by specifying the
encryption key. What I wanted to say is that user can create an
encryption key object by SQL function or other with the name (say
'mykey'). The encryption key here is TDEK, not MDEK. And then user can
specify the encryption key object when table creation by the name.
Therefore the key never be logged.

Furthermore, the encryption key object could also be used by pgcrypto;
currently we have to pass key itself as argument of pgcrypto function
such as decrypt() or encrypt() but we can change such functions so
that we can specify the name of key object instead. It's a just idea
though.

>
> I would be +1 for being able to explicitly set tables to be encrypted,
> and +1 for a GUC that turns on encryption for all tables. I see a lot of
> footguns with configurability and understand there are implementation
> headaches as well, but wanted to float the ideas.

It seems good idea.

>
> >>> I've considered to store IV and key id to a new fork but I felt that
> >>> it is complex because we will always need to have the fork on the
> >>> shared buffer when any pages of its main fork is written to the disk.
> >>> If almost buffers of the shared buffers are dirtied and theirs new
> >>> forks are not loaded to the shared buffer, we might need to load the
> >>> new fork and write the page to the disk and then evict some pages,
> >>> over and over.
> >>>
> >>> We will use (page lsn, page number) to create a nonce. IVs are created
> >>> by encrypting the nonce with its TDEK.
> >>
> >> Agreed.
>
> We just need to ensure this adds up to 16 bytes for the IV based on all
> of the encryption methods we are considering. I believe this gets us to
> 12, so we need 4 additional bytes.
>
> To echo an idea up thread, we could make this completely
> nondeterministic and keep a randomly generated IV on the page header
> (understanding this takes up even more space, and we may need some more
> space anyway based on the outcome of the MAC discussion). Or perhaps we
> just need to keep 4 bytes for a random salt on the page header that can
> be appended to the page LSN / page no. pair.
>
> >>
> >>> WAL Encryption
> >>> =============
> >>> We will use AES-CTR for WAL encryption and encrypt each WAL pages with WDEK.
> >>>
> >>> We will use WAL segment number to create a nonce. Similar to buffer
> >>> encryption, IVs are created using by the nonce and WDEK.
>
> Same comment as above RE needing 16 bytes for the IV, as well as
> possible solutions.
>
> >>
> >> Yes. If there is concern about collision of table/index and WAL IVs, we
> >> can add a constant to the two uses, as Joe Conway mentioned.
> >>
> >>> If we want to support enabling or disabling encryption after initdb we
> >>> might want to have key id in the WAL page header.
>
> Makes sense. I think the big question is if one enables encryption after
> initdb and after there is already data in the database, what happens?
> Sounds like it could be a bit of a challenge :)

I guess that when user requested to encrypt the table we can mark
every pages as that this page has to be encrypted before writing so as
the table is encrypted. For WAL, I've not consider deeply yet but we
might need to switch WAL and enables WAL encryption from the next WAL
file.

>
> >>>
> >>> Front-end Tool Support
> >>> ==================
> >>> We will add --encryption-key-passphrase-command option to the
> >>> front-end tools that read database files or WAL segment files directly.
> >>> They can get KEK via --encryption-key-passphrase-command and get MDEK
> >>> by reading the key file. Also they can know the key length by checking
> >>> the control file. Since they can derive TDEK using by key id stored in
> >>> the page header they can decrypt database files. Similarly, they also
> >>> can decrypt WAL as they can know the key id of WDEK.
>
> +1.
>
> >>>
> >>> Master Key Rotation
> >>> ================
> >>> We will support new command-line tool that rotates the master key
> >>> offline. It accepts --old-encryption-key-passphrase-command option and
> >>> --new-encryption-key-passphrase-command to get old KEK and new KEK
> >>> respectively. It decrypt MDEK with the old key and encrypt it with
> >>> the new key.
> >>
> >> That handles changing the passphrase, but what about rotating the
> >> encryption key? Don't we want to support that, at least in offline
> >> mode?
> >
> > Yeah, supporting rotating the encryption key is a good idea. Agreed.
>
> I think part of the reason for having the KEK is we can rotate the KEK
> without needing to rotate the MDEK.
>
> Rotating the MDEK could cause a pretty significant downtime event based
> on the size of your data. Perhaps something like that should be there
> for emergencies, but arguably rotating a MDEK would the the equivalent
> of a logical restore to another cluster.

Yeah, it actually depends on the size of your *encrypted* data. So if
we can encrypt only some important table the rotating of MDEK would
not take a long time.

>
> >
> > After more thoughts, it's a just idea but I wonder if the first
> > implementation step of TDE for v13 could be key management module.
> > That is, (in 3-tier case) PostgreSQL gets KEK by passphrase command or
> > directly, and creates MDEK. User can create an encryption key with
> > name using by SQL function, and the key manager derives DEK and store
> > its salt to the disk. Also we have an internal interface to get an
> > encryption key.
> >
> > The good point is not only to develop incrementally but also that if
> > PostgreSQL is able to manage (symmetric) encryption keys inside
> > database cluster and has interfaces to get and add keys, pgcrypt also
> > will be able to use it. That way, we will provide column-level TDE
> > first by combination of pgcrypt, triggers and views while keeping
> > encryption keys truly secret. After that we can add other level TDE
> > using the key management module. We would then be able to focus on how
> > to encrypt buffer and WAL.
>
> I think it is a logical starting point to get the key management module
> into place, as the rest of the systems to build out from there. That is
> how I built my (poor) prototype :)
>
> Given you can already get column level encryption with pgcrypto with
> external key management, my suggestion is to spend the effort getting
> the TDE architecture nailed down.

Thank you! Agreed.

>
> (I would also be -1 for making the MDEK available to the user in any way
> other than it sitting in the encrypted storage file where it is wrapped.
> If they wish to unwrap the MDEK from there with the KEK, that would be
> their choice.)

Agreed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-07-26 16:13:23 Re: psql FETCH_COUNT feature does not work with combined queries
Previous Message Anastasia Lubennikova 2019-07-26 14:56:05 Re: Optimze usage of immutable functions as relation