Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-09 02:17:53
Message-ID: CAH7T-aq7y=rYfK6ryHX-S76+-3EDoyE8i0xKcxqnBdWE60ZT_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 8, 2019 at 2:16 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Wed, Aug 7, 2019 at 08:56:18AM -0400, Sehrope Sarkuni wrote:
> > Simplest approach for derived keys would be to use immutable attributes
> of the
> > WAL files as an input to the key derivation. Something like HKDF(MDEK,
> "WAL:" |
>
> So, I am thinking we should use "WAL:" for WAL and "REL:" for heap/index
> files.
>

Sounds good. Any unique convention is fine. Main thing to keep in mind is
that they're directly tied to the master key so it's not possible to rotate
them without changing the master key.

This is in contrast to saving a WDEK key to a file (similar to how the MDEK
key would be saved) and unlocking it with the MDEK. That has more moving
parts but would allow that key to be independent of the MDEK. In a later
message Stephen refers to an example of a replica receiving encrypted WAL
and applying it with a different MDEK for the page buffers. That's doable
with an independent WDEK.

> > | timeline_id || wal_segment_num) should be fine for this as it is:
>
> I considered using the timeline in the nonce, but then remembered that
> in timeline switch, we _copy_ the part of the old WAL up to the timeline
> switch to the new timeline; see:
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/xlog.c;h=f55352385732c6b0124eff5265462f3883fe7435;hb=HEAD#l5502
>
> * Initialize the starting WAL segment for the new timeline. If the
> switch
> * happens in the middle of a segment, copy data from the last WAL
> segment
> * of the old timeline up to the switch point, to the starting WAL
> segment
> * on the new timeline.
>
> We would need to decrypt/encrypt to do the copy, and just wasn't sure of
> the value of the timeline in the nonce. One value to it is that if
> there some WAL that generated after the timeline switch in the old
> primary that isn't transfered, there would be potentially new data
> encrypted with the same key/nonce in the new primary, but if that WAL is
> not used, odds are it is gone/destroyed/inaccessible, or it would have
> been used during the switchover, so it didn't seem worth worrying about.
>
> One _big_ reason to add the timeline is if you had a standby that you
> recovered and rolled forward only to a specific transaction, then
> continued running it as a new primary. In that case, you would have
> different WAL encrypted with the same key/nonce, but that sounds like
> the same as promoting two standbys, and we should just document not to
> do it.
>
> Maybe we need to consider this further.
>

Good points. Yes, anything short of generating a new key at promotion time
will have these issues. If we're not going to do that, no point in adding
the timeline id if it does not change anything. I had thought only the
combo was unique but sounds like the segment number is unique on its own.
One thing I like about a unique per-file key is that it simplifies the IV
generation (i.e. can start at zero).

What about discarding the rest of the WAL file at promotion and skipping to
a new file? With a random per-file key in the first page header would
ensure that going forward all WAL data is encrypted differently. Combine
that with independent WAL and MDEK keys and everything would be different
between two replicas promoted from the same point.

> > A unique WDEK per WAL file that is derived from the segment number would
> not
> > have that problem. A unique key per-file means the IVs can all start at
> zero
> > and the each file can be treated as one encrypted stream. Any encryption/
> > decryption code would only need to touch the write/read callsites.
>
> So, I am now wondering when we should be using a non-zero nonce to
> start, and when we should be using derived keys. Should we add the
> page-number to the derived key for heap/index files too and just use the
> LSN for nonce, or add the LSN to the derived key too?
>

The main cost of using multiple keys is that you need to derive or unlock
them for each usage.

A per-type, per-relation, or per-file derived key with the same
non-repeating guarantees for the IV (ex: LSN + Page Number) is as secure
but allows for caching all needed derived keys in memory (it's one per open
file descriptor).

Having page-level derived keys incorporating the LSN + Page Number and
starting the per-page IV at zero works, but you'd have to perform an HKDF
for each page read or write. A cache of those derived keys would be much
larger (32-bytes per page) so presumably you're not going to have them all
cached or maybe not bother with any caching,

> > Even without a per-page MAC, a MAC at some level for WAL has its own
> benefits
> > such as perfect corruption detection. It could be per-record,
> per-N-records,
> > per-checkpoint, or per-file. The current WAL file format already handles
> > arbitrary gaps so there is significantly more flexibility in adding it vs
> > pages. I'm not saying it should be a requirement but, unlike pages, I
> would not
> > rule it out just yet as it may not be that complicated.
>
> We already have a CRC in the WAL that detects corruption, and that would
> be encrypted, so it is a MAC.

Encrypting a CRC does not make it a cryptographic MAC. It'd have problems
similar to those discussed for the per-page CRC though it'd still be useful
for basic corruption detection.

> It is an int32, so twice as many bits as
> the heap/index page CRC --- better, but not great. It would be pretty

trivial to increase that to 64 bite if desired.
>

"64 bite" is referring to "bit" or "byte"? ;-) I'm guessing bits...

For the WAL record CRC I think it makes sense to keep the shared wal buffer
format in place and leave it on the plaintext (rather than on the cipher
text like is being proposed for the page buffers). The WAL records are not
fixed length so if the rest of the stream is encrypted there no way for a
program without the key to be able to figure out the record offsets. Would
need *some* information like the record size left in plaintext. Not sure
the implications of that.

I still think there could be a separate full MAC at some aggregated level.
Per-page seems like a good fit as that's how the writes happen and it could
be calculated just before the actual per-page write.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sehrope Sarkuni 2019-08-09 02:34:26 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Michael Paquier 2019-08-09 02:15:30 Re: Refactoring code stripping trailing \n and \r from strings