Re: storing an explicit nonce

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-26 17:56:38
Message-ID: CA+TgmoZ5bK89mG7KO8Bp0qHS2w11j_-YHC=nV0Bs9oGqpqk6=w@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 25, 2021 at 7:58 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> The simple thought I had was masking them out, yes. No, you can't
> re-encrypt a different page with the same nonce. (Re-encrypting the
> exact same page with the same nonce, however, just yields the same
> cryptotext and therefore is fine).

In the interest of not being viewed as too much of a naysayer, let me
first reiterate that I am generally in favor of TDE going forward and
am not looking to throw up unnecessary obstacles in the way of making
that happen.

That said, I don't see how this particular idea can work. When we want
to write a page out to disk, we need to identify which bits in the
page are hint bits, so that we can avoid including them in what is
encrypted, which seems complicated and expensive. But even worse, when
we then read a page back off of disk, we'd need to decrypt everything
except for the hint bits, but how do we know which bits are hint bits
if the page isn't decrypted yet? We can't annotate an 8kB page that
might be full with enough extra information to say where the
non-encrypted parts are and still have the result be guaranteed to fit
within 8kB.

Also, it's not just hint bits per se, but anything that would cause us
to use MarkBufferDirtyHint(). For a btree index, per _bt_check_unique
and _bt_killitems, that includes the entire line pointer array,
because of how ItemIdMarkDead() is used. Even apart from the problem
of how decryption would know which things we encrypted and which
things we didn't, I really have a hard time believing that it's OK to
exclude the entire line pointer array in every btree page from
encryption from a security perspective. Among other potential
problems, that's leaking all the information an attacker could
possibly want to have about where their known plaintext might occur in
the page.

However, I believe that if we store the nonce in the page explicitly,
as proposed here, rather than trying to derive it from the LSN, then we
don't need to worry about this kind of masking, which I think is
better from both a security perspective and a performance perspective.
There is one thing I'm not quite sure about, though. I had previously
imagined that each page would have a nonce and we could just do
nonce++ each time we write the page. But that doesn't quite work if
the standby can do more writes of the same page than the master. One
vague idea I have for fixing this is: let each page's 16-byte nonce
consist of 8 random bytes and an 8-byte counter that will be
incremented on every write. But, the first time a standby writes each
page, force a "key rotation" where the 8-byte random value is replaced
with a new one, different from the one the master is using for that
page. Detecting this is a bit expensive, because it probably means we
need to store the TLI that last wrote each page on every page too, but
maybe it could be made to work; we're talking about a feature that is
expensive by nature. However, I'm a little worried about the
cryptographic properties of this approach. It would often mean that an
attacker who has full filesystem access can get multiple encrypted
images of the same data, each encrypted with a different nonce. I
don't know whether that's a hazard or not, but it feels like the sort
of thing that, if I were a cryptographer trying to break this, I would
be pleased to have.
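
To make the nonce layout above concrete, here is a rough sketch of the
bookkeeping, assuming the page stores the 8-byte random prefix, the
8-byte counter, and the TLI that last wrote it. The names PageNonce and
nonce_for_write are made up for illustration and are not from any
patch; resetting the counter to zero on rotation is also just an
assumption.

```python
import secrets
from dataclasses import dataclass

COUNTER_MAX = 2**64 - 1  # largest value an 8-byte counter can hold


@dataclass(frozen=True)
class PageNonce:
    prefix: bytes   # 8 random bytes
    counter: int    # incremented on every write of the page
    last_tli: int   # timeline that last wrote the page (assumed field)

    def to_bytes(self) -> bytes:
        # 16-byte nonce: random prefix followed by the big-endian counter.
        return self.prefix + self.counter.to_bytes(8, "big")


def nonce_for_write(old: PageNonce, current_tli: int) -> PageNonce:
    """Return the nonce to store for the next write of a page."""
    if old.last_tli != current_tli or old.counter == COUNTER_MAX:
        # First write by a different timeline, or counter exhausted:
        # "rotate" by picking a fresh random prefix and restarting the count.
        prefix = secrets.token_bytes(8)
        while prefix == old.prefix:
            prefix = secrets.token_bytes(8)
        return PageNonce(prefix, 0, current_tli)
    # Same writer as before: keep the prefix, bump the counter.
    return PageNonce(old.prefix, old.counter + 1, current_tli)
```

The extra last_tli field is where the per-page cost mentioned above
comes from: without it, a writer cannot tell whether it is the first
one on its timeline to touch the page.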

Another idea might be - instead of doing nonce++ every time we write
the page, do nonce=random(). That's eventually going to repeat a
value, but it's extremely likely to take a *super* long time if there
are enough bits. A potentially rather large problem, though, is that
generating random numbers in large quantities isn't very cheap.
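
For a rough sense of "enough bits", the usual birthday-bound
approximation can be computed directly. This is only a sketch: it
assumes the nonces are uniformly random and independent, and the
function name is made up.

```python
import math


def collision_probability(num_writes: int, nonce_bits: int) -> float:
    """Approximate chance that any two of num_writes random nonces match.

    Uses the birthday approximation 1 - exp(-n(n-1) / 2^(b+1)).
    """
    exponent = -num_writes * (num_writes - 1) / 2.0 ** (nonce_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Example calls: 2**32 writes with a 64-bit and a 128-bit random nonce.
p64 = collision_probability(2**32, 64)
p128 = collision_probability(2**32, 128)
```

Under those assumptions a 64-bit random value gets uncomfortable after
a few billion writes, while a full 128-bit random value stays
negligible for any realistic write count.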

Anybody got a better idea?

I really like your (Stephen's) idea of including something in the
special space that permits integrity checking. One thing that is quite
nice about that is we could do it first, as an independent patch,
before we did TDE. It would be an independently useful feature, and it
would mean that if there are any problems with the code that injects
stuff into the special space, we could try to track those down in a
non-TDE context. That's really good, because in a TDE context, the
pages are going to be garbled and unreadable (we hope, anyway). If we
have a problem that we can reproduce with just an integrity-checking
token shoved into every page, you can look at the page and try to
understand what went wrong. So I really like this direction both from
the point of view of improving integrity checking, and also from the
point of view of being able to debug problems.

Now, one downside of this approach is that if we have the ability to
turn integrity-checking tokens on and off, and separately we can turn
encryption on and off, then we can't simplify down to two cases as
Andres was advocating above; you have to cater to a variety of
possible values of how-much-stuff-we-squeezed-into-the-special-space.
At that point you kind of end up with the approach the draft patches
were already taking, which Andres was worried would be expensive.

I am not entirely certain, however, that I understand exactly what the
proposal here is for integrity verification. I Googled
"AES-GCM using/storing tags" but it didn't help me that much, because
I don't really know the subject area. A really simple integrity
verifier for a page would be to store the db OID, ts OID, relfilenode,
and block number in the page, and check them on read, preventing
blocks from moving around without us noticing. But I gather that
perhaps the idea here is to store something like
hash(db_oid||ts_oid||relfilenode||block||block_contents) in each page,
basically a beefed-up checksum that is too wide to fake easily. It's
probably more complicated than that, though: I admit to having limited
knowledge of modern cryptography.
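
As a sketch of the "beefed-up checksum" reading above, and only that:
the following uses HMAC-SHA-256 as a stand-in for whatever keyed tag is
really intended (an AES-GCM tag would be computed differently). The
function names, the 4-byte big-endian encoding of each identifier, and
the key handling are all assumptions made for illustration.

```python
import hashlib
import hmac
import struct


def page_tag(key: bytes, db_oid: int, ts_oid: int, relfilenode: int,
             block: int, contents: bytes) -> bytes:
    """Keyed tag over db_oid||ts_oid||relfilenode||block||block_contents."""
    # Fixed-width fields keep the concatenation unambiguous.
    header = struct.pack(">IIII", db_oid, ts_oid, relfilenode, block)
    return hmac.new(key, header + contents, hashlib.sha256).digest()


def verify_page(key: bytes, db_oid: int, ts_oid: int, relfilenode: int,
                block: int, contents: bytes, stored_tag: bytes) -> bool:
    """Check on read that the page is where, and what, it claims to be."""
    expected = page_tag(key, db_oid, ts_oid, relfilenode, block, contents)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, stored_tag)
```

Because the block's location is part of what gets tagged, a page copied
to a different block number fails verification even if its contents are
untouched, which covers the simpler "blocks moving around" check as
well.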

--
Robert Haas
EDB: http://www.enterprisedb.com
