Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-25 20:03:43
Message-ID: 20190725200343.xo4dcjm5azrfn6zr@momjian.us
Lists: pgsql-hackers

On Thu, Jul 25, 2019 at 03:43:34PM -0400, Alvaro Herrera wrote:
> On 2019-Jul-15, Bruce Momjian wrote:
>
> > Uh, if someone modifies a few bytes of the page, we will decrypt it, but
> > the checksum (per-page or WAL) will not match our decrypted output. How
> > would they make it match the checksum without already knowing the key.
> > I read [1] but could not see that explained.
> >
> > This post discussed it:
> >
> > https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac
>
> I find all the discussion downthread from this post pretty confusing.

Agreed.

> Why are we encrypting the page header in the first place? It seems to
> me that the encrypted area should cover only the line pointers and the
> tuple data area; the page header needs to be unencrypted so that it can
> be used at all: firstly because you need to obtain the LSN from it in

Yes, the plan was to not encrypt the first 16 bytes so the LSN was visible.

> order to compute the IV, and secondly because the checksum must be
> validated *before* decrypting (per Moxie Marlinspike's "cryptographic
> doom" principle mentioned in a comment in the SE question).

Uh, I think we are still on the fence about writing the checksum _after_
encryption, but I think we are leaning against that, meaning online or
offline checksum verification must be able to decrypt the page. Since we
will already need an offline tool to enable/remove encryption anyway, it
seems we can just reuse that code for pg_checksums.

I think we have three options for the CRC:

1. compute CRC and then encrypt everything

2. encrypt and then CRC, and store the CRC unencrypted

3. encrypt and then CRC, and store the CRC encrypted

The only way offline tools can verify the CRC without access to the keys
is via #2, but #2 gives us _no_ detection of tampering. I realize the
CRC tampering detection of #1 and #3 is not great, but it certainly has
some value.
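
To make the three orderings concrete, here is a minimal Python sketch.
It is not PostgreSQL code: zlib.crc32 stands in for the page checksum, a
toy SHA-256 keystream stands in for AES, and every function name below
is hypothetical. It only models a naive edit of the stored bytes.

```python
import hashlib
import zlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). A stand-in for AES, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (the operation is symmetric)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))


def crc(data: bytes) -> bytes:
    """4-byte CRC-32, used here in place of the real page checksum."""
    return zlib.crc32(data).to_bytes(4, "big")


# Option 1: compute the CRC, then encrypt data and CRC together.
def write_opt1(key: bytes, nonce: bytes, data: bytes) -> bytes:
    return xor_crypt(key, nonce, data + crc(data))


def check_opt1(key: bytes, nonce: bytes, stored: bytes) -> bool:
    plain = xor_crypt(key, nonce, stored)
    return crc(plain[:-4]) == plain[-4:]


# Option 2: encrypt, then CRC the ciphertext and store the CRC in the clear.
def write_opt2(key: bytes, nonce: bytes, data: bytes) -> bytes:
    ciphertext = xor_crypt(key, nonce, data)
    return ciphertext + crc(ciphertext)


def check_opt2(stored: bytes) -> bool:
    # No key needed, which is what lets offline tools verify the page.
    return crc(stored[:-4]) == stored[-4:]


# Option 3: encrypt, then CRC the ciphertext and store the CRC encrypted.
def write_opt3(key: bytes, nonce: bytes, data: bytes) -> bytes:
    ciphertext = xor_crypt(key, nonce, data)
    # Assumption: a separate nonce suffix keeps the CRC keystream distinct.
    return ciphertext + xor_crypt(key, nonce + b"crc", crc(ciphertext))


def check_opt3(key: bytes, nonce: bytes, stored: bytes) -> bool:
    ciphertext, enc_crc = stored[:-4], stored[-4:]
    return crc(ciphertext) == xor_crypt(key, nonce + b"crc", enc_crc)


def flip_first_bit(buf: bytes) -> bytes:
    """Naive tampering: flip the lowest bit of the first stored byte."""
    return bytes([buf[0] ^ 1]) + buf[1:]


def forge_opt2(stored: bytes) -> bytes:
    """Tamper with an option-2 page and recompute its clear-text CRC, no key used."""
    ciphertext = flip_first_bit(stored[:-4])
    return ciphertext + crc(ciphertext)
```

With these helpers, someone without the key can run forge_opt2() on a #2
page and check_opt2() still passes, while the same naive bit flip makes
check_opt1() and check_opt3() fail. The sketch says nothing about a more
careful attacker, so it only illustrates the best-effort detection that
#1 and #3 provide.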

> I am not totally clear on whether the special space and the "page hole"
> need to be encrypted. I tend to think that they should *not* be
> encrypted; in particular, encrypting a large area containing zeroes seems
> a plentiful source of known cleartext, which seems a bad thing. Special
> space also seems to contain known cleartext; maybe not as much as the
> page hole, but still seems better avoided.

Uh, there are no known attacks on AES with known plain-text, e.g., SSL
uses AES, so I think we are good with encrypting everything after the
first 16 bytes.
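
As a rough illustration of "everything after the first 16 bytes", the
Python sketch below leaves the 16-byte prefix in the clear and reads the
LSN from the first 8 bytes of the page (where pd_lsn lives) to build the
IV. The toy SHA-256 keystream is a stand-in for AES, deriving the IV from
the LSN alone is a simplifying assumption, and the function names are
hypothetical.

```python
import hashlib

PAGE_SIZE = 8192    # default PostgreSQL block size
CLEAR_PREFIX = 16   # bytes left unencrypted so the LSN stays readable
LSN_LEN = 8         # pd_lsn occupies the first 8 bytes of the page header


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). A stand-in for AES, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_page(key: bytes, page: bytes) -> bytes:
    """Encrypt everything after the clear prefix, keyed on the page's LSN."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    iv = page[:LSN_LEN]  # assumption: the IV is derived from the LSN only
    body = page[CLEAR_PREFIX:]
    stream = keystream(key, iv, len(body))
    return page[:CLEAR_PREFIX] + bytes(a ^ b for a, b in zip(body, stream))


# XOR with the same keystream undoes itself, and the LSN is still readable
# in the clear prefix, so decryption is the same operation.
decrypt_page = encrypt_page
```

The point of the layout is that the reader can recover the IV input from
the page itself before decrypting anything; the page hole and special
space simply fall inside the encrypted body.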

> Given this, it seems to me that we should first encrypt those two data
> areas, and *then* compute the CRC on the complete page just like we do
> today ... and the result is stored in an unencrypted area (the page
> header) and so it doesn't affect the encryption.

Yes, that is a possibility.

> The checksum we currently have is not cryptographically secure -- it's
> not a crypto-strong signature. If we want that, we need some further
> protection. Maybe for encrypted tables we replace our current checksum
> with a cryptographically secure signature ...? Pretty sure 16 bits are
> insufficient for that, but I suppose we would just use a different page
> header with room for a proper sig.

Yes, the checksum is more best-effort than fully secure, but replay of
pages makes a fully secure solution hard anyway.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
