Re: XTS cipher mode for cluster file encryption

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: Sasasu <i(at)sasa(dot)su>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XTS cipher mode for cluster file encryption
Date: 2021-10-26 21:11:39
Message-ID: f51aad97-5f3d-0a61-69d9-2c297ea934e1@enterprisedb.com
Lists: pgsql-hackers

On 10/26/21 21:43, Stephen Frost wrote:
> Greetings,
>
> * Yura Sokolov (y(dot)sokolov(at)postgrespro(dot)ru) wrote:
>> ...
>>
>> Integrity could be based on simple non-cryptographic checksum, and it could
>> be checked after decryption. It would be impossible to intentionally change
>> encrypted page in a way it will pass checksum after decryption.
>
> No, it wouldn't be impossible when we're talking about non-cryptographic
> checksums. That is, in fact, why you'd call them that. If it were
> impossible (or at least utterly impractical) then you'd be able to claim
> that it's cryptographic-level integrity validation.
>

Yeah, our checksums are probabilistic protection against rare and random
bitflips caused by hardware, not against an attacker in the crypto sense.

To explain why it's not enough, consider that our checksum is a uint16,
i.e. there are only 64k possible values. In other words, you can try
flipping bits in the encrypted page, and after generating about 64k
variants you can expect at least one of them to match the stored
checksum. Yes, it's harder to get a collision with the existing checksum,
and encryption methods that diffuse bits better make it harder to get a
valid page after decryption, but it's simply not the same thing as
crypto integrity.
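
To put rough numbers on this, here is a small Python sketch of the
brute-force idea. It is only an illustration under stated assumptions:
checksum16() is a toy stand-in (CRC-32 truncated to 16 bits), not
PostgreSQL's actual page checksum algorithm, and random bytes stand in
for whatever a tampered ciphertext block would decrypt to. All names
are made up for the example.

```python
import random
import zlib

PAGE_SIZE = 8192  # PostgreSQL's default block size
BLOCK = 16        # size of one AES block

def checksum16(page: bytes) -> int:
    """Toy 16-bit checksum (truncated CRC-32); NOT PostgreSQL's algorithm."""
    return zlib.crc32(page) & 0xFFFF

def random_bytes(rng: random.Random, n: int) -> bytes:
    return rng.getrandbits(8 * n).to_bytes(n, "little")

def forge(page: bytes, target: int, rng: random.Random,
          max_tries: int = 2_000_000):
    """Replace the last block with garbage until the checksum matches.

    Assumption: the garbage models the pseudo-random plaintext that a
    modified ciphertext block decrypts to.
    Returns (forged_page, tries), or (None, max_tries) on failure.
    """
    prefix = page[:-BLOCK]
    prefix_crc = zlib.crc32(prefix)  # reuse the CRC of the untouched part
    for tries in range(1, max_tries + 1):
        garbage = random_bytes(rng, BLOCK)
        if zlib.crc32(garbage, prefix_crc) & 0xFFFF == target:
            return prefix + garbage, tries
    return None, max_tries

rng = random.Random(2021)
original = random_bytes(rng, PAGE_SIZE)
target = checksum16(original)
forged, tries = forge(original, target, rng)

# Chance that at least one of 65536 independent attempts matches.
p_64k = 1 - (1 - 1 / 65536) ** 65536

print(f"forged page found after {tries} tries")
print(f"chance of success within 65536 tries: {p_64k:.3f}")
```

Each attempt matches with probability 1/65536, so roughly 63% of such
searches finish within 64k attempts and nearly all of them finish within
a few hundred thousand. That is trivial for an attacker, which is the
point: a 16-bit checksum is not an integrity guarantee.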

Let's not try inventing something custom, there have been enough crypto
failures due to smart custom stuff in the past already.

BTW I'm not sure what the existing patches do, but I wonder if we should
calculate the checksum before or after encryption. I'd say it should be
after encryption, because checksums were meant as a protection against
issues at the storage level, so the checksum should be on what's written
to storage, and it'd also allow offline verification of checksums etc.
(Of course, that'd make the whole idea of relying on our checksums even
more futile.)

Note: Maybe there are reasons why the checksum needs to be calculated
before encryption, not sure.
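
A minimal Python sketch of the two orderings, to make the offline
verification point concrete. Everything here is an assumption made for
illustration: toy_cipher() is a SHA-256 keystream stand-in and not XTS,
checksum16() is a truncated CRC-32 and not our page checksum, and the
function names are hypothetical.

```python
import hashlib
import zlib

def toy_cipher(key: bytes, page_no: int, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256 counter keystream. NOT XTS."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = key + page_no.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))

def checksum16(data: bytes) -> int:
    """Toy 16-bit checksum; NOT PostgreSQL's algorithm."""
    return zlib.crc32(data) & 0xFFFF

def write_checksum_after(key: bytes, page_no: int, plaintext: bytes):
    """Checksum covers what is actually written to storage."""
    stored = toy_cipher(key, page_no, plaintext)
    return stored, checksum16(stored)

def write_checksum_before(key: bytes, page_no: int, plaintext: bytes):
    """Checksum covers the plaintext page."""
    return toy_cipher(key, page_no, plaintext), checksum16(plaintext)

def verify_offline(stored: bytes, checksum: int) -> bool:
    """Needs no key; only meaningful for the checksum-after layout."""
    return checksum16(stored) == checksum

def verify_with_key(key: bytes, page_no: int, stored: bytes,
                    checksum: int) -> bool:
    """The checksum-before layout has to decrypt first."""
    return checksum16(toy_cipher(key, page_no, stored)) == checksum

key = b"k" * 32
page = bytes(range(256)) * 32  # 8192-byte dummy page

stored_a, ck_a = write_checksum_after(key, 7, page)
stored_b, ck_b = write_checksum_before(key, 7, page)

print(verify_offline(stored_a, ck_a))           # no key involved
print(verify_with_key(key, 7, stored_b, ck_b))  # key required
```

With the checksum computed after encryption, an offline verification
tool can check pages without ever seeing the key. With the checksum
computed before encryption, every verification has to decrypt the page
first.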

>> Currently we have 16bit checksum, and it is very small. But having larger
>> checksum is orthogonal (ie doesn't bound) to having encryption.
>
> Sure, but that would also require a page-format change. We've pointed
> out the downsides of that and what it would prevent in terms of
> use-cases. That's still something that might happen but it would be a
> different effort from this.
>

... and if such a page format change ends up happening, it'd be fairly
easy to just add some extra crypto data into the page header and not rely
on the data checksums at all.

>> In fact, Adiantum is easily made close to SIV construction:
>> - just leave last 8/16 bytes zero. If after decryption they are zero,
>> then integrity check passed.
>> That is because SIV and Adiantum are very similar in its structure:
>> - SIV:
>> -- hash
>> -- then stream cipher
>> - Adiantum:
>> -- hash (except last 16bytes)
>> -- then encrypt last 16bytes with hash,
>> -- then stream cipher
>> -- then hash.
>> If last N (N>16) bytes is nonce + zero bytes, then "hash, then encrypt last
>> 16bytes with hash" become equivalent to just "hash", and Adiantum became
>> logical equivalent to SIV.
>
> While I appreciate your interest in this, I don't think it makes sense
> for us to try and implement something of our own- we're not
> cryptographers. Best is to look at published guidance and what other
> projects have had success doing, and that's what this thread has been
> about.
>

Yeah, I personally don't see much difference between XTS and Adiantum.

Adiantum has a bunch of benefits, but the main reason why Google
developed it seems to be performance on low-end ARM machines (i.e.
phones). Which is nice, but it's probably not hugely important - very few
people run Pg on such machines, especially in performance-sensitive
contexts.

It's true Adiantum is probably more resilient to IV reuse etc. but it's
not like XTS is suddenly obsolete, and it certainly doesn't solve the
integrity issue etc.

>>>> - like XTS it doesn't need to change plain text format and doesn't need in
>>>> additional Nonce/Auth Code.
>>>
>>> Sure, in which case it's something that could potentially be added later
>>> as another option in the future. I don't think we'll always have just
>>> one encryption method and it's good to generally think about what it
>>> might look like to have others but I don't think it makes sense to try
>>> and get everything in all at once.
>>
>> And among others Adiantum looks best: it is fast even without hardware
>> acceleration, it provides whole block encryption (ie every bit depends
>> on every bit) and it doesn't bound to plain-text format.
>
> And it could still be added later as another option if folks really want
> it to be. I've outlined why it makes sense to go with XTS first but I
> don't mean that to imply that we'll only ever have that. Indeed, once
> we've actually got something, adding other methods will almost certainly
> be simpler. Trying to do everything from the start will make this very
> difficult to accomplish though.
>

Yeah.

So maybe the best thing is simply to roll with both - design the whole
feature in a way that allows selecting the encryption scheme, with two
options. That's generally a good engineering practice, as it ensures
things are not coupled too much. And it's not like the encryption
methods are expected to be super difficult.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
