Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-07 17:38:56
Message-ID: 20190807173856.jz5g3esknlovunxw@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 7, 2019 at 11:41:51AM -0400, Sehrope Sarkuni wrote:
> On Wed, Aug 7, 2019 at 7:19 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Wed, Aug  7, 2019 at 05:13:31PM +0900, Masahiko Sawada wrote:
> > I understood. IIUC in your approach postgres processes encrypt WAL
> > records when inserting to the WAL buffer. So WAL data is encrypted
> > even on the WAL buffer.
>
>
> I was originally thinking of not encrypting the shared WAL buffers but that may
> have issues. If the buffers are already encrypted and contiguous in shared
> memory, it's possible to write out many via a single pg_pwrite(...) call as is
> currently done in XLogWrite(...).

The shared buffers will not be encrypted --- they are encrypted only
when being written to storage. We felt encrypting shared buffers will
be too much overhead, for little gain. I don't know if we will encrypt
while writing to the WAL buffers or while writing the WAL buffers to
the file system.

> If they're not encrypted you'd need to do more work in that critical section.
> That'd involve allocating a commensurate amount of memory to hold the encrypted
> pages and then encrypting them all prior to the single pg_pwrite(...) call.
> Reusing one buffer is possible but it would require encrypting and writing the
> pages one by one. Both of those seem like a bad idea.

Well, right now the 8k pages is part of the WAL stream, so I don't know
it would be any more overhead than other WAL writes. I am hoping we can
generate the encryption bit stream in chunks earlier so we can just to
the XOR was we are writing the data to the WAL buffers.

> Better to pay the encryption cost at the time of WAL record creation and keep
> the writing process as fast and simple as possible.

Yes, I don't think we know at the time of WAL record creation what
_offset_ the records will have when then are written to WAL, so I am
thinking we need to do it later, and as I said, I am hoping we can
generate the encryption bit stream earlier.

> > It works but I think the implementation might be complex; For example
> > using openssl, we would use EVP functions to encrypt data by
> > AES-256-CTR. We would need to make IV and pass it to them and these
> > functions however don't manage the counter value of nonce as long as I
> > didn't miss. That is, we need to calculate the correct counter value
> > for each encryption and pass it to EVP functions. Suppose we encrypt
> > 20 bytes of WAL. The first 16 bytes is encrypted with nonce of
> > (segment_number, 0) and the next 4 bytes is encrypted with nonce of
> > (segment_number, 1). After that suppose we encrypt 12 bytes of WAL. We
> > cannot use nonce of (segment_number, 2) but should use nonce of
> > (segment_number , 1). Therefore we would need 4 bytes padding and to
> > encrypt it and then to throw that 4 bytes away .
>
> Since we want to have per-byte control over encryption, for both
> heap/index pages (skip LSN and CRC), and WAL (encrypt to the last byte),
> I assumed we would need to generate a bit stream of a specified size and
> do the XOR ourselves against the data.  I assume ssh does this, so we
> would have to study the method.
>
>
> The lower level non-EVP OpenSSL functions allow specifying the offset within
> the 16-byte AES block from which the encrypt/decrypt should proceed. It's the
> "num" parameter of their encrypt/decrypt functions. For a continuous encrypted
> stream such as a WAL file, a "pread(...)" of a possibly non-16-byte aligned
> section would involve determining the 16-byte counter (byte_offset / 16) and
> the intra-block offset (byte_offset % 16). I'm not sure how one handles
> initializing the internal encrypted counter and that might be one more step
> that would need be done. But it's definitely possible to read / write less than
> a block via those APIs (not the EVP ones).
>
> I don't think the EVP functions have parameters for the intra-block offset but
> you can mimic it by initializing the IV/block counter and then skipping over
> the intra-block offset by either reading or writing a dummy partial block. The
> EVP read and write functions both deal with individual bytes so once you've
> seeked to your desired offset you can read or write the real individual bytes.

Can we generate the bit stream in 1MB chunks or something and just XOR
as needed?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2019-08-07 18:27:44 Re: crash 11.5~ (and 11.4)
Previous Message Justin Pryzby 2019-08-07 17:30:13 crash 11.5~