Re: storing an explicit nonce

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-26 18:37:05
Message-ID: 20210526183705.GG20766@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, May 25, 2021 at 7:58 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > The simple thought I had was masking them out, yes. No, you can't
> > re-encrypt a different page with the same nonce. (Re-encrypting the
> > exact same page with the same nonce, however, just yields the same
> > cryptotext and therefore is fine).
>
> In the interest of not being viewed as too much of a naysayer, let me
> first reiterate that I am generally in favor of TDE going forward and
> am not looking to throw up unnecessary obstacles in the way of making
> that happen.

Quite glad to hear that. Hopefully we'll all be able to get on the same
page to move TDE forward.

> That said, I don't see how this particular idea can work. When we want
> to write a page out to disk, we need to identify which bits in the
> page are hint bits, so that we can avoid including them in what is
> encrypted, which seems complicated and expensive. But even worse, when
> we then read a page back off of disk, we'd need to decrypt everything
> except for the hint bits, but how do we know which bits are hint bits
> if the page isn't decrypted yet? We can't annotate an 8kB page that
> might be full with enough extra information to say where the
> non-encrypted parts are and still have the result be guaranteed to fit
> within 8kb.

Yeah, Andres pointed that out and it's certainly an issue with this
general idea.

> Also, it's not just hint bits per se, but anything that would cause us
> to use MarkBufferDirtyHint(). For a btree index, per _bt_check_unique
> and _bt_killitems, that includes the entire line pointer array,
> because of how ItemIdMarkDead() is used. Even apart from the problem
> of how decryption would know which things we encrypted and which
> things we didn't, I really have a hard time believing that it's OK to
> exclude the entire line pointer array in every btree page from
> encryption from a security perspective. Among other potential
> problems, that's leaking all the information an attacker could
> possibly want to have about where their known plaintext might occur in
> the page.

Also a good point.

> However, I believe that if we store the nonce in the page explicitly,
> as proposed here, rather trying to derive it from the LSN, then we
> don't need to worry about this kind of masking, which I think is
> better from both a security perspective and a performance perspective.
> There is one thing I'm not quite sure about, though. I had previously
> imagined that each page would have a nonce and we could just do
> nonce++ each time we write the page. But that doesn't quite work if
> the standby can do more writes of the same page than the master. One
> vague idea I have for fixing this is: let each page's 16-byte nonce
> consist of 8 random bytes and an 8-byte counter that will be
> incremented on every write. But, the first time a standby writes each
> page, force a "key rotation" where the 8-byte random value is replaced
> with a new one, different one from what the master is using for that
> page. Detecting this is a bit expensive, because it probably means we
> need to store the TLI that last wrote each page on every page too, but
> maybe it could be made to work; we're talking about a feature that is
> expensive by nature. However, I'm a little worried about the
> cryptographic properties of this approach. It would often mean that an
> attacker who has full filesystem access can get multiple encrypted
> images of the same data, each encrypted with a different nonce. I
> don't know whether that's a hazard or not, but it feels like the sort
> of thing that, if I were a cryptographer, I would be pleased to have.

I do agree that, in general, this is a feature that's expensive to begin
with and folks are generally going to be accepting of that. Encrypting
the same data with different nonces will produce different results and
shouldn't be an issue. The nonces really do need to be unique for a
given key though.

> Another idea might be - instead of doing nonce++ every time we write
> the page, do nonce=random(). That's eventually going to repeat a
> value, but it's extremely likely to take a *super* long time if there
> are enough bits. A potentially rather large problem, though, is that
> generating random numbers in large quantities isn't very cheap.

There's specific discussion about how to choose a nonce in NIST
publications and using a properly random one that's large enough is
one accepted approach, though my recollection was that the preference
was to use an incrementing guaranteed-unique nonce and using a random
one was more of a "if you can't coordinate using an incrementing one
then you can do this". I can try to hunt for the specifics on that
though.

The issue of getting large amounts of cryptographically random numbers
seems very likely to make this not work so well though.

> Anybody got a better idea?

If we stipulate (and document) that all replicas need their own keys
then we no longer need to worry about nonce re-use between the primary
and the replica. Not sure that's *better*, per se, but I do think it's
worth consideration. Teaching pg_basebackup how to decrypt and then
re-encrypt with a different key wouldn't be challenging.

> I really like your (Stephen's) idea of including something in the
> special space that permits integrity checking. One thing that is quite
> nice about that is we could do it first, as an independent patch,
> before we did TDE. It would be an independently useful feature, and it
> would mean that if there are any problems with the code that injects
> stuff into the special space, we could try to track those down in a
> non-TDE context. That's really good, because in a TDE context, the
> pages are going to be garbled and unreadable (we hope, anyway). If we
> have a problem that we can reproduce with just an integrity-checking
> token shoved into every page, you can look at the page and try to
> understand what went wrong. So I really like this direction both from
> the point of view of improving integrity checking, and also from the
> point of view of being able to debug problems.

I agree with all of this.

> Now, one downside of this approach is that if we have the ability to
> turn integrity-checking tokens on and off, and separately we can turn
> encryption on and off, then we can't simplify down to two cases as
> Andres was advocating above; you have to cater to a variety of
> possible values of how-much-stuff-we-squeezed-into-the-special space.
> At that point you kind of end up with the approach the draft patches
> were already taking, which Andres was worried would be expensive.

Yes, if the amount of space available is variable then there's an added
cost for that. While I appreciate the concern about having that be
expensive, for my 2c at least, I like to think that having this sudden
space that's available for use may lead to other really interesting
capabilities beyond the ones we're talking about here, so I'm not really
thrilled with the idea of boiling it down to just two cases.

> I am not entirely certain, however, that I understand what the
> proposal is here exactly for integrity verification. I Googled
> "AES-GCM using/storing tags" but it didn't help me that much, because
> I don't really know the subject area. A really simple integrity
> verifier for a page would be to store the db OID, ts OID, relfilenode,
> and block number in the page, and check them on read, preventing
> blocks from moving around without us noticing. But I gather that
> perhaps the idea here is to store something like
> hash(db_oid||ts_oid||relfilenode||block||block_contents) in each page,
> basically a beefed-up checksum that is too wide to fake easily. It's
> probably more complicated than that, though: I admit to having limited
> knowledge of modern cryptography.

Happy to help on this bit. Probably the simplest way to explain what's
going on here is that you have two functions- encrypt and decrypt. The
encrypt function takes: (key, nonce, plaintext) and returns (ciphertext,
tag). The decrypt function takes: (key, nonce, ciphertext, tag) and
returns: (plaintext) ... OR an error saying "data integrity check
failed".

As an example, here's a test case from NIST for AES GCM *encryption*:

Key = 31bdadd96698c204aa9ce1448ea94ae1fb4a9a0b3c9d773b51bb1822666b8f22
IV = 0d18e06c7c725ac9e362e1ce
PT = 2db5168e932556f8089a0622981d017d
AAD =
CT = fa4362189661d163fcd6a56d8bf0405a
Tag = d636ac1bbedd5cc3ee727dc2ab4a9489

key/IV (aka nonce)/PT are inputs, CT and Tag are outputs.

Then an example for AES GCM *decryption*:

Key = 4c8ebfe1444ec1b2d503c6986659af2c94fafe945f72c1e8486a5acfedb8a0f8
IV = 473360e0ad24889959858995
CT = d2c78110ac7e8f107c0df0570bd7c90c
AAD =
Tag = c26a379b6d98ef2852ead8ce83a833a7
PT = 7789b41cb3ee548814ca0b388c10b343

Key/IV/CT/Tag are inputs, PT is the output

... but, a more interesting one when considering the tag is:

Key = c997768e2d14e3d38259667a6649079de77beb4543589771e5068e6cd7cd0b14
IV = 835090aed9552dbdd45277e2
CT = 9f6607d68e22ccf21928db0986be126e
AAD =
Tag = f32617f67c574fd9f44ef76ff880ab9f
FAIL

Again, Key/IV/CT/Tag are inputs, but there's no PT output and instead
you just get FAIL and that's because the data integrity check failed.

Exactly how the tag is generated is discussed here if you're really
curious:

https://en.wikipedia.org/wiki/Galois/Counter_Mode

but the gist of that is that it's done as part of the encryption. Note
that you can include additional data beyond just what you're encrypting
in the tag. In our case, we would probably include the LSN, which would
mean that the LSN would be confirmed to be correct additional
information that wasn't actually encrypted. The "AAD" above is
"Additional Authenticated Data".

One thing to be absolutely clear about here though is that simply taking
a hash() of the ciphertext and storing that with the data does *not*
provide cryptographic data integrity validation for the page because it
doesn't involve the actual key or IV at all and the hash is done after
the ciphertext is generated- therefore an attacker can change the data
and just change the hash to match and you'd never know.

Now, when it comes to hashing the *plaintext* data and storing that, you
have to be careful ther because you can very easily fall into the trap
of giving away information about the plaintext data that way if an
attacker can reason about what the plaintext might look like. If I know
the block contains just a single english word and all we've done is
sha256'd it then I can just run sha256 on all english words and figure
out what it is, so to protect the data you need to incorporate the key,
nonce, etc, somehow into the hash (that is- something that should be
very hard for the attacker to discover) and suddently you're doing what
AES-GCM *already* does for you, except you're trying to hack it yourself
instead of using the tools available which were written by experts.

The way that I tend to look at this area is that everyone used to try
and do encryption and data integrity independently and the result was a
bunch of different implementations, some good, some bad (and therefore
leaked sensitive information) and the crypto folks basically said "ok,
let's take the *good* implementations and bake that in, because
otherwise people are going to just keep screwing up and using bad
approaches for this."

What this means for your proposal above is that the actual data
validation information will be generated in two different ways depending
on if we're using AES-GCM and doing TDE, or if we're doing just the data
validation piece and not encrypting anything. That's maybe not ideal
but I don't think it's a huge issue either and your proposal will still
address the question of if we end up missing anything when it comes to
how the special area is handled throughout the code.

If it'd help, I'd be happy to jump on a call to discuss further. Also
happy to continue on this thread too, of course.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-05-26 19:31:28 Re: Move pg_attribute.attcompression to earlier in struct for reduced size?
Previous Message Robert Haas 2021-05-26 17:56:38 Re: storing an explicit nonce