Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-07 15:41:51
Message-ID: CAH7T-aopo_wb_wabSfVNWrxti0bk-741R-gJc0yh+a9bvHGTew@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 7, 2019 at 7:19 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Wed, Aug 7, 2019 at 05:13:31PM +0900, Masahiko Sawada wrote:
> > I understood. IIUC in your approach postgres processes encrypt WAL
> > records when inserting to the WAL buffer. So WAL data is encrypted
> > even on the WAL buffer.
>

I was originally thinking of not encrypting the shared WAL buffers, but
that may have issues. If the buffers are already encrypted and contiguous
in shared memory, many of them can be written out with a single
pg_pwrite(...) call, as is currently done in XLogWrite(...).

If they're not encrypted, you'd need to do more work in that critical
section. That would mean either allocating a commensurate amount of memory
to hold the encrypted pages and encrypting them all prior to the single
pg_pwrite(...) call, or reusing one buffer and encrypting and writing the
pages one by one. Both of those seem like a bad idea.

Better to pay the encryption cost at the time of WAL record creation and
keep the writing process as fast and simple as possible.

> > It works but I think the implementation might be complex; For example
> > using openssl, we would use EVP functions to encrypt data by
> > AES-256-CTR. We would need to make IV and pass it to them and these
> > functions however don't manage the counter value of nonce as long as I
> > didn't miss. That is, we need to calculate the correct counter value
> > for each encryption and pass it to EVP functions. Suppose we encrypt
> > 20 bytes of WAL. The first 16 bytes is encrypted with nonce of
> > (segment_number, 0) and the next 4 bytes is encrypted with nonce of
> > (segment_number, 1). After that suppose we encrypt 12 bytes of WAL. We
> > cannot use nonce of (segment_number, 2) but should use nonce of
> > (segment_number , 1). Therefore we would need 4 bytes padding and to
> > encrypt it and then to throw that 4 bytes away .
>
> Since we want to have per-byte control over encryption, for both
> heap/index pages (skip LSN and CRC), and WAL (encrypt to the last byte),
> I assumed we would need to generate a bit stream of a specified size and
> do the XOR ourselves against the data. I assume ssh does this, so we
> would have to study the method.
>

The lower-level non-EVP OpenSSL functions allow specifying the offset
within the 16-byte AES block from which the encrypt/decrypt should proceed.
It's the "num" parameter of their encrypt/decrypt functions. For a
continuous encrypted stream such as a WAL file, a "pread(...)" of a
possibly non-16-byte-aligned section would involve determining the block
counter (byte_offset / 16) and the intra-block offset (byte_offset % 16).
I'm not sure how one handles initializing the internal encrypted counter
block, and that might be one more step that would need to be done. But it's
definitely possible to read / write less than a block via those APIs (not
the EVP ones).
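
To make the counter arithmetic concrete, here is a minimal Python sketch. It
is not OpenSSL code, and the keystream is a stand-in built from SHA-256 rather
than AES, so it only illustrates the offset handling. The names
keystream_block and ctr_xor are hypothetical. It also matches the "generate
a stream and XOR it ourselves" idea from the quoted text.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in PRF, NOT AES: hash key || nonce || counter, keep 16 bytes.
    data = key + nonce + counter.to_bytes(8, "big")
    return hashlib.sha256(data).digest()[:BLOCK]


def ctr_xor(key: bytes, nonce: bytes, byte_offset: int, data: bytes) -> bytes:
    # Block counter and intra-block offset for the starting position.
    counter, skip = divmod(byte_offset, BLOCK)
    out = bytearray()
    pos = 0
    while pos < len(data):
        # Drop the part of the first block that lies before byte_offset.
        ks = keystream_block(key, nonce, counter)[skip:]
        chunk = data[pos:pos + len(ks)]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
        pos += len(chunk)
        counter += 1
        skip = 0
    return bytes(out)


key, nonce = b"k" * 32, b"segment-1"
wal = bytes(range(32))

# 20 bytes first, then 12 more starting at byte offset 20,
# i.e. block counter 1 and intra-block offset 4.
first = ctr_xor(key, nonce, 0, wal[:20])
second = ctr_xor(key, nonce, 20, wal[20:])
assert first + second == ctr_xor(key, nonce, 0, wal)
```

The second call reuses block 1 of the keystream starting at its fifth byte,
which is the case the 20-byte / 12-byte example in the quoted text describes.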

I don't think the EVP functions have parameters for the intra-block offset,
but you can mimic it by initializing the IV/block counter and then skipping
over the intra-block offset by either reading or writing a dummy partial
block. The EVP encrypt and decrypt functions both deal with individual
bytes, so once you've seeked to your desired offset you can read or write
the real individual bytes.
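
The skip-ahead trick can be sketched in Python as follows. CtrStream is a
hypothetical stand-in for an EVP-style context that can only be initialized
at a block boundary and then fed bytes; its keystream comes from SHA-256,
not AES, so this shows the seeking logic only, not real encryption.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


class CtrStream:
    """Hypothetical EVP-like context: starts at a block boundary only."""

    def __init__(self, key: bytes, nonce: bytes, counter: int):
        self.key, self.nonce, self.counter = key, nonce, counter
        self.buf = b""  # unused keystream bytes of the current block

    def update(self, data: bytes) -> bytes:
        out = bytearray()
        for b in data:
            if not self.buf:
                # Stand-in PRF, NOT AES.
                seed = self.key + self.nonce + self.counter.to_bytes(8, "big")
                self.buf = hashlib.sha256(seed).digest()[:BLOCK]
                self.counter += 1
            out.append(b ^ self.buf[0])
            self.buf = self.buf[1:]
        return bytes(out)


def open_at(key: bytes, nonce: bytes, byte_offset: int) -> CtrStream:
    counter, skip = divmod(byte_offset, BLOCK)
    stream = CtrStream(key, nonce, counter)
    # Feed a dummy partial block and throw the output away, which
    # leaves the stream positioned at the intra-block offset.
    stream.update(bytes(skip))
    return stream


key, nonce = b"k" * 32, b"segment-1"
wal = bytes(range(32))

whole = CtrStream(key, nonce, 0).update(wal)
tail = open_at(key, nonce, 20).update(wal[20:])
assert tail == whole[20:]
```

The cost of the trick is at most 15 discarded bytes per seek, and the same
open_at call works for decryption since CTR mode is a plain XOR.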

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
