Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-27 17:31:49
Message-ID: 20190727173149.wgt374p6ivltq6dz@momjian.us
Lists: pgsql-hackers

On Thu, Jul 25, 2019 at 10:57:08PM -0400, Alvaro Herrera wrote:
> On 2019-Jul-25, Bruce Momjian wrote:
>
> > On Thu, Jul 25, 2019 at 03:43:34PM -0400, Alvaro Herrera wrote:
>
> > > Why are we encrypting the page header in the first place? It seems to
> > > me that the encrypted area should cover only the line pointers and the
> > > tuple data area; the page header needs to be unencrypted so that it can
> > > be used at all: firstly because you need to obtain the LSN from it in
> >
> > Yes, the plan was to not encrypt the first 16 bytes so the LSN was visible.
>
> I don't see the value of encrypting the rest of the page header
> (which includes the page checksum).

Well, let's unpack this. Encrypting the page in units smaller than
16 bytes is going to require the use of CTR mode, but I think we are
leaning toward that anyway.

One advantage of not encrypting the hole is that it might be faster, but
I think it might reduce parallelism possibilities, so it might be
slower. This might need testing.

Not encrypting the hole does leak the size of the hole to the attacker,
but the size of the table is also visible to the attacker, so I don't
know if the hole size helps. Knowing index hole size might be useful to
an attacker --- not sure.
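
To illustrate the granularity point, here is a rough sketch in Python (not
PostgreSQL code). It assumes the default 8192-byte block size, a 24-byte page
header, 4-byte line pointers, and the 16-byte unencrypted prefix mentioned
above; the function names and the example pd_lower/pd_upper values are made
up. It only checks whether the regions left to encrypt, once the hole is
skipped, are whole AES blocks, which a block-oriented mode without padding
would need.

```python
AES_BLOCK = 16      # AES block size in bytes
CLEAR_PREFIX = 16   # bytes left unencrypted so the LSN stays visible
PAGE_SIZE = 8192    # default PostgreSQL block size (assumed here)


def regions_skipping_hole(pd_lower, pd_upper, page_size=PAGE_SIZE):
    """Byte ranges (start, end) to encrypt if the hole [pd_lower, pd_upper) is skipped."""
    return [(CLEAR_PREFIX, pd_lower), (pd_upper, page_size)]


def block_aligned(regions):
    """True if every region is a whole number of AES blocks."""
    return all((end - start) % AES_BLOCK == 0 for start, end in regions)


# Hypothetical page: 24-byte header plus five 4-byte line pointers,
# with tuple data starting at offset 7900.
regions = regions_skipping_hole(pd_lower=24 + 5 * 4, pd_upper=7900)
whole_page = [(CLEAR_PREFIX, PAGE_SIZE)]
```

Here block_aligned(regions) is False (the two regions are 28 and 292 bytes),
while block_aligned(whole_page) is True (8176 bytes is 511 blocks). A stream
mode such as CTR does not care about that alignment.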

> > > order to compute the IV, and secondly because the checksum must be
> > > validated *before* decrypting (per Moxie Marlinspike's "cryptographic
> > > doom" principle mentioned in a comment in the SE question).
> >
> > Uh, I think we are still on the fence about writing the checksum _after_
> > encryption,
>
> I don't see what's the reason for doing that. The "cryptographic doom
> principle" page talks about this kind of scenario, and ISTM that the
> ultimate suggestion is that the page checksum ought to be verifiable
> prior to doing any decryption.

Uh, I listed the three options for the CRC and gave the benefits of
each:

https://www.postgresql.org/message-id/20190725200343.xo4dcjm5azrfn6zr@momjian.us

Obviously I was not clear on the benefits. To quote:

1. compute CRC and then encrypt everything
3. encrypt and then CRC, and store the CRC encrypted

Numbers 1 & 3 give us tampering detection, though with the CRC being so
small, it isn't totally secure.

> Are you worried about an attacker forging the page checksum by
> installing another encrypted page that gives the same checksum? I'm not
> sure how that attack works ... I mean why can the attacker install
> arbitrary pages?

Well, with #2:

2. encrypt and then CRC, and store the CRC unchanged

you can modify the page, even small parts, and just replace the CRC to
match your changes. In #1 and #3, you would get a CRC error in almost
all cases since you have no way of setting the decrypted CRC without
knowing the key. You can change the encrypted CRC, but the odds that
the decrypted one would match the page are very slim.
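
A minimal sketch of the #2 case in Python, to make the problem concrete. This
is not PostgreSQL code: zlib.crc32 stands in for the page checksum, the
"ciphertext" is just opaque bytes (the attack never needs the key), and all
function names are made up for illustration.

```python
import zlib


def store_page_option2(ciphertext):
    """Option #2: checksum the already-encrypted page, store the checksum in the clear."""
    return ciphertext, zlib.crc32(ciphertext)


def verify_page_option2(ciphertext, stored_crc):
    """What an offline tool can check without access to the key."""
    return zlib.crc32(ciphertext) == stored_crc


def tamper_option2(ciphertext, offset):
    """Attacker without the key: flip one byte, then recompute the checksum to match."""
    modified = bytearray(ciphertext)
    modified[offset] ^= 0xFF
    modified = bytes(modified)
    return modified, zlib.crc32(modified)


page, crc = store_page_option2(bytes(range(256)) * 32)  # 8192 opaque bytes
forged_page, forged_crc = tamper_option2(page, 100)
```

verify_page_option2(forged_page, forged_crc) passes even though the page was
changed, which is the "no detection of tampering" property of #2. The sketch
says nothing about how strong #1 and #3 are; it only shows why #2 gives
nothing.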

> > The only way offline tools can verify the CRC without access to the keys
> > is via #2, but #2 gives us _no_ detection of tampering. I realize the
> > CRC tampering detection of #1 and #3 is not great, but it certainly has
> > some value.
>
> It seems to me that you're trying to invent a cryptographic signature
> scheme on your own. That seems very likely to backfire.

Well, we have to live within the constraints we have. The question is
whether there is sufficient value in having such tampering detection (#1
& #3) compared to the ease of having offline tools verify the checksums
without needing access to the keys (#2).

> > > I am not totally clear on whether the special space and the "page hole"
> > > need to be encrypted. I tend to think that they should *not* be
> > > encrypted; in particular, encrypting a large area containing zeroes seems
> > > a plentiful source of known cleartext, which seems a bad thing. Special
> > > space also seems to contain known cleartext; maybe not as much as the
> > > page hole, but still seems better avoided.
> >
> > Uh, there are no known attacks on AES with known plain-text, e.g., SSL
> > uses AES, so I think we are good with encrypting everything after the
> > first 16 bytes.
>
> Well, maybe there aren't any attacks *now*, but I don't know what will
> happen in the future. I'm not clear what's the intended win by
> encrypting the all-zeroes page hole anyway. If you leave it
> unencrypted, the attacker knows the size of the hole, as well as the
> size of the tuple data area and the size of the LP array. Is that a
> side-channel that leaks much?

See above.

> > > The checksum we currently have is not cryptographically secure -- it's
> > > not a crypto-strong signature. If we want that, we need some further
> > > protection. Maybe for encrypted tables we replace our current checksum
> > > with a cryptographically secure signature ...? Pretty sure 16 bits are
> > > insufficient for that, but I suppose we would just use a different page
> > > header with room for a proper sig.
> >
> > Yes, checksum is more for best-effort than fully secure, but replay of
> > pages makes a fully secure solution hard anyway.
>
> What do you mean by "replay of pages"?

Someone can replace the entire page with an old copy of the page they
saved, and since they didn't modify the page, even for #1 and #3, the
checksum would match, unless the encryption key has been rotated.
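
A toy sketch of that replay in Python, using option #1. Everything here is an
assumption for illustration: the keystream is SHA-256 in counter mode standing
in for AES-CTR, the IV is derived from the LSN alone, only an 8-byte LSN is
left in the clear, zlib.crc32 stands in for the page checksum, and the
function names are invented.

```python
import hashlib
import struct
import zlib


def keystream(key, lsn, length):
    """Toy counter-mode keystream (SHA-256 based); a stand-in for AES-CTR."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + struct.pack(">QQ", lsn, counter)).digest()
        counter += 1
    return bytes(out[:length])


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def write_page(key, lsn, data):
    """Option #1: CRC the plaintext, then encrypt CRC + data; the LSN stays readable."""
    body = struct.pack(">I", zlib.crc32(data)) + data
    return struct.pack(">Q", lsn) + xor(body, keystream(key, lsn, len(body)))


def read_page(key, page):
    """Decrypt and verify; raises ValueError on checksum mismatch."""
    (lsn,) = struct.unpack(">Q", page[:8])
    body = xor(page[8:], keystream(key, lsn, len(page) - 8))
    (crc,) = struct.unpack(">I", body[:4])
    data = body[4:]
    if zlib.crc32(data) != crc:
        raise ValueError("checksum mismatch")
    return data


key = b"k" * 32
old_copy = write_page(key, lsn=100, data=b"balance=10")  # attacker saves these bytes
current = write_page(key, lsn=200, data=b"balance=99")   # later legitimate write
replayed = old_copy                                      # attacker puts the old bytes back
```

read_page(key, replayed) returns the old data without any error, because the
old page, its LSN, and its encrypted checksum are all self-consistent. Reading
the same bytes with a different key yields garbage that fails the checksum
with overwhelming probability, which is the key-rotation case.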

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
