Re: XTS cipher mode for cluster file encryption

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Sasasu <i(at)sasa(dot)su>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XTS cipher mode for cluster file encryption
Date: 2021-10-18 15:56:03
Message-ID: 20211018155603.GR20998@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> On 10/15/21 21:22, Stephen Frost wrote:
> >Now, to address the concern around re-encrypting a block with the same
> >key+IV but different data and leaking what parts of the page changed, I
> >do think we should use the LSN and have it change regularly (including
> >unlogged tables) but that's just because it's relatively easy for us to
> >do and means an attacker wouldn't be able to tell what part of the page
> >changed when the LSN was also changed. That was also recommended by
> >NIST and that's a pretty strong endorsement also.
>
> Not sure - it seems a bit weird to force LSN change even in cases that don't
> generate any WAL. I was not following the encryption thread and maybe it was
> discussed/rejected there, but I've always imagined we'd have a global nonce
> generator (similar to a sequence) and we'd store it at the end of each
> block, or something like that.

The 'LSN' being referred to here isn't the regular LSN that is
associated with the WAL but rather the separate FakeLSN counter which we
already have. I wasn't suggesting having the regular LSN change in
cases that don't generate WAL.

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Fri, Oct 15, 2021 at 3:22 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Specifically: The default cipher for LUKS is nowadays aes-xts-plain64
> >
> > and then this:
> >
> > https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt
> >
> > where plain64 is defined as:
> >
> > plain64: the initial vector is the 64-bit little-endian version of the
> > sector number, padded with zeros if necessary
> >
> > That is, the default for LUKS is AES, XTS, with a simple IV. That
> > strikes me as a pretty ringing endorsement.
>
> Yes, that sounds promising. It might not hurt to check for other
> precedents as well, but that seems like a pretty good one.
>
> I'm not very convinced that using the LSN for any of this is a good
> idea. Something that changes most of the time but not all the time
> seems more like it could hurt by masking fuzzy thinking more than it
> helps anything.

This argument doesn't come across as very strong at all to me,
particularly when we have explicit recommendations from NIST that having
the IV vary more is beneficial. While this would be using the LSN, the
fact that the LSN changes most of the time but not all of the time isn't
new and is something we already have to deal with. I'd think we'd
address the concern about mis-thinking around how this works by
providing a README and/or an appropriate set of comments around what's
being done and why.

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2021-10-15 15:22:48 -0400, Stephen Frost wrote:
> > * Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> > > Finally, there is an interesting web page about when not to use XTS:
> > >
> > > https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/
> >
> > This particular article always struck me as more of a reason for us, at
> > least, to use XTS than to not- in particular the very first comment it
> > makes, which seems to be pretty well supported, is: "XTS is the de-facto
> > standard disk encryption mode."
>
> I don't find that line of argument *that* convincing. The reason XTS is the
> de-facto standard is that for generic block layer encryption is that you can't
> add additional data for each block without very significant overhead
> (basically needing journaling to ensure that the data doesn't get out of
> sync). But we don't really face the same situation - we *can* add additional
> data.

No, we can't always add additional data, and that's part of the
consideration for an XTS option- there are things we can do if we use
XTS that we can't with GCM or another solution. Specifically, being
able to perform physical replication from an unencrypted cluster to an
encrypted one is a worthwhile use-case that we shouldn't be just tossing
out.

> With something like AES-GCM-SIV we can use the additional data to get IV reuse
> resistance *and* authentication. And while perhaps we are ok with the IV reuse
> guarantees XTS has, it seems pretty clear that we'll want want guaranteed
> authenticity at some point. And then we'll need extra data anyway.

I agree that it'd be useful to have an authenticated encryption option.
Implementing XTS doesn't preclude us from adding that capability down
the road and it's simpler with fewer dependencies. These all strike me
as good reasons to add XTS first.

> Thus, to me, it doesn't seem worth going down the XTS route, just to
> temporarily save a bit of implementation effort. We'll have to endure that
> pain anyway.

This isn't a valid argument as it isn't just about implementation but
about the capabilities we will have once it's done.

* Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> On 10/15/21 23:02, Robert Haas wrote:
> >On Fri, Oct 15, 2021 at 3:22 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >>That is, the default for LUKS is AES, XTS, with a simple IV. That
> >>strikes me as a pretty ringing endorsement.
> >
> >Yes, that sounds promising. It might not hurt to check for other
> >precedents as well, but that seems like a pretty good one.
>
> TrueCrypt/VeraCrypt uses XTS too, I think. There's an overview of other FDE
> products at [1], and some of them use XTS, but I would take that with a
> grain of salt - some of the products are somewhat obscure, very old, or
> both.
>
> What is probably more interesting is that there's an IEEE standard [2]
> dealing with encrypted shared storage, and that uses XTS too. I'd bet
> there's a bunch of smart cryptographers involved.

Thanks for finding those and linking to them, that's helpful.

> >I'm not very convinced that using the LSN for any of this is a good
> >idea. Something that changes most of the time but not all the time
> >seems more like it could hurt by masking fuzzy thinking more than it
> >helps anything.
>
> I haven't been following the discussion about using LSN, but I agree that
> while using it seems convenient, the consequences of some changes not
> incrementing LSN seem potentially disastrous, depending on the encryption
> mode.

Yes, this depends on the encryption mode, and is why we are specifically
talking about XTS here as it's an encryption mode that doesn't suffer
from this risk and therefore it's perfectly fine to use the LSN/FakeLSN
with XTS (and would also be alright for AES-GCM-SIV as it's specifically
designed to be resistant to IV reuse).

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Fri, Oct 15, 2021 at 10:57:03PM +0200, Tomas Vondra wrote:
> > > That said, I don't think that's really a huge issue or something that's
> > > a show stopper or a reason to hold off on using XTS. Note that what
> > > those bits actually *are* isn't leaked, just that they changed in some
> > > fashion inside of that 16-byte cipher text block. That they're directly
> > > leaked with CTR is why there was concern raised about using that method,
> > > as discussed above and previously.
> >
> > Yeah. With CTR you pretty learn where the hint bits are exactly, while with
> > XTS the whole ciphertext changes.
> >
> > This also means CTR is much more malleable, i.e. you can tweak the
> > ciphertext bits to flip the plaintext, while with XTS that's not really
> > possible - it's pretty much guaranteed to break the block structure. Not
> > sure if that's an issue for our use case, but if it is then neither of the
> > two modes is a solution.
>
> Yes, this is a vary good point. Let's look at the impact of _not_ using
> the LSN. For CTR (already rejected) bit changes would be visible by
> comparing old/new page contents. For CBC (also not under consideration)
> the first 16-byte block would show a change, and all later 16-byte
> blocks would show a change. For CBC, you see the 16-byte blocks change,
> but you have no idea how many bits were changed, and in what locations
> in the 16-byte block (AES uses substitution and diffusion). For XTS,
> because earlier blocks don't change the IV used by later blocks like
> CBC, you would be able to see each 16-byte block that changed in the 8k
> page. Again, you would not know the number of bits changed or their
> locations.
>
> Do we think knowing which 16-byte blocks on an 8k page change would leak
> useful information? If so, we should use the LSN and just accept that
> some cases might leak as described above. If we don't care, then we can
> skip the use of the LSN and simplify the patch.

While there may not be an active attack against PG that leverages such a
leak, I have a hard time seeing why we would intentionally design this
in when we have a great option that's directly available to us and
doesn't cause such a leak with nearly such regularity as not using the
LSN would, and also follows recommendations of using XTS from NIST.
Further, not using the LSN wouldn't really be an option if we did
eventually implement AES-GCM-SIV, so why not have the two cases be
consistent?

> > Not sure - it seems a bit weird to force LSN change even in cases that don't
> > generate any WAL. I was not following the encryption thread and maybe it was
> > discussed/rejected there, but I've always imagined we'd have a global nonce
> > generator (similar to a sequence) and we'd store it at the end of each
> > block, or something like that.
>
> Storing the nonce in the page means more code complexity, possible
> performance impact, and the inability to create standbys via binary
> replication that use cluster file encryption.

Right- the point of XTS is that we can do things that we can't with GCM
and that it's simpler.

> As a final comment to Andres's email, adding a GCM has the problems
> above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc, which
> could also affect the integrity of the data. Someone could also restore
> and old copy of a patch to revert a change, and that would not be
> detected even by GCM.

That's an argument based on how things stand today. I appreciate that
it's no small thing to consider changes to those other systems but I
would argue that having authentication of the heap is still better than
not (but also agree that XTS is simpler to implement and therefore makes
sense to do first and see how things stand after that's done). Surely,
we would want progress made here to be done so incrementally as a patch
that attempted to change all of those other systems to be encrypted and
authenticated with AES-GCM-SIV would be far too large to consider in one
shot anyway.

> I consider this a checkbox feature and making it too complex will cause
> it to be rightly rejected.

Presuming that 'checkbox feature' here means "we need it to please
$someone but no one will ever use it" or something along those lines,
this is very clearly not the case and therefore we shouldn't be
describing it or treating it as such. Even if the meaning here is
"there's other ways people could get this capability" the reality is
that those other methods are simply not always available and in those
cases, people will choose to not use PostgreSQL. Nearly every other
database system which we might compare ourselves to has a solution in
this area and people actively use those solutions in a lot of
deployments.

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2021-10-16 10:16:25 -0400, Bruce Momjian wrote:
> > As a final comment to Andres's email, adding a GCM has the problems
> > above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc, which
> > could also affect the integrity of the data. Someone could also restore
> > and old copy of a patch to revert a change, and that would not be
> > detected even by GCM.
>
> > I consider this a checkbox feature and making it too complex will cause
> > it to be rightly rejected.
>
> You're just deferring / hiding the complexity. For one, we'll need integrity
> before long if we add encryption support. Then we'll deal with a more complex
> on-disk format because there will be two different ways of encrypting. For
> another, you're spreading out the security analysis to a lot of places in the
> code and more importantly to future changes affecting on-disk data.

I don't follow this argument. The XTS approach is explicitly the same
on-disk format as what we have unencrypted today, just encrypted, and
that's the whole point of going with that approach. If we were to
implement AES-GCM-SIV, then that would introduce a new on-disk format
and then we'd have two- one which has space on each page for the
authentication information, and one which doesn't.

> If it's really just a checkbox feature without a real use case, then we should
> just reject requests for it and use our energy for useful things.

This capability certainly has a real use-case and it's one that a lot of
organizations are looking for PG to provide a solution for. That we
don't today is keeping those organizations from using PG in at least
some cases, and for some organizations, it prevents them from using PG
at all, as they understandably would rather not deal with a hybrid of
using PG for some things and having to use another solution for other
things.

* Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> On 10/16/21 16:16, Bruce Momjian wrote:
> >Storing the nonce in the page means more code complexity, possible
> >performance impact, and the inability to create standbys via binary
> >replication that use cluster file encryption.
>
> Would it really be that complex? Reserving a bunch of bytes at the end of
> each encrypted page (a bit like the "special" space, but after encryption)
> seems fairly straightforward. And I don't quite see why would this have a
> measurable impact, given the nonce is 16B at most. The encryption is likely
> way more expensive.

There's a patch to do exactly this- make space available towards the end
of the page. If we go down the route of using a different page format
then we lose the ability to do physical replication between an
unencrypted cluster and an encrypted one. That's certainly a nice
capability to have and it will help people migrate to an encrypted PG
instance, plus it's overall simpler to work with, which is also an
advantge.

> Moreover, it seems fairly reasonable to trade a bit of code complexity for
> something LSN-based which seems simpler but apparently has various weak
> points and is much harder to reason about.

This isn't just about code complexity but is also about the resulting
capabilities from these different approaches.

* Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> On 10/16/21 18:28, Bruce Momjian wrote:
> >On Sat, Oct 16, 2021 at 09:15:05AM -0700, Andres Freund wrote:
> >>On 2021-10-16 10:16:25 -0400, Bruce Momjian wrote:
> >>>As a final comment to Andres's email, adding a GCM has the problems
> >>>above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc, which
> >>>could also affect the integrity of the data. Someone could also restore
> >>>and old copy of a patch to revert a change, and that would not be
> >>>detected even by GCM.
> >>
> >>>I consider this a checkbox feature and making it too complex will cause
> >>>it to be rightly rejected.
> >>
> >>You're just deferring / hiding the complexity. For one, we'll need integrity
> >>before long if we add encryption support. Then we'll deal with a more complex
> >>on-disk format because there will be two different ways of encrypting. For
> >>another, you're spreading out the security analysis to a lot of places in the
> >>code and more importantly to future changes affecting on-disk data.
>
> I've argued for storing the nonce, but I don't quite see why would we need
> integrity guarantees?
>
> AFAICS the threat model the patch aims to address is an attacker who can
> observe the data (e.g. a low-privileged OS user), but can't modify the
> files. Which seems like a reasonable model for shared environments.

There are multiple threat models which we should be considering and
that's why we may want to eventually add integrity.

> IMO extending this to cases where the attacker can modify the data moves the
> goalposts quite significantly. And it's quite possible authenticated
> encryption would not be enough to prevent that, because that still works
> only at block level, and you can probably do a lot of harm with replay
> attacks (e.g. replacing blocks with older versions). And if you can modify
> the data directory / config files, what are the chances you can't just get
> access to the database, trace the processes or whatever?

I agree that working towards an authenticated solution is a larger task.
I don't agree that we should throw out the possibility that we may want
to implement it eventually as there are certainly threat models where an
attacker might have access to the storage but not to the database or the
system on which the database is running. Implementing a system to
address such an attack vector would take more consideration than just
having authenticated encryption provided by PG, but it certainly
couldn't be done without that either.

> We already have a way to check integrity by storing page checksum, but I'm
> not sure if that's good enough (there's a lot of subtle issues with building
> proper AEAD scheme).

No, it isn't good enough.

* Sasasu (i(at)sasa(dot)su) wrote:
> Just a mention. the HMAC (or AE/AD) can be disabled in AES-GCM. HMAC in
> AES-GCM is an encrypt-then-hash MAC.

Not sure why you would though.

> CRC-32 is not a crypto-safe hash (technically CRC-32 is not a hash
> function). Cryptographers may unhappy with CRC-32.

Yes, that's correct (and it isn't even CRC-32 that we have, heh).

> I think CRC or SHA is not such important. If IV can be stored, I believe
> there should have enough space to store HMAC.

This would be the case, yes. If we can find a way to make room for an
IV then we could make room to store the tag too, and we certainly should
(and we should include a variety of additional data in the AEAD- block
number, relfileno, etc).

* Sasasu (i(at)sasa(dot)su) wrote:
> On 2021/10/16 04:57, Tomas Vondra wrote:
> > Seems reasonable, on the assumption the threat models are the same.
>
> On 2021/10/16 03:22, Stephen Frost wrote:
> >plain64: the initial vector is the 64-bit little-endian version of the
> >sector number, padded with zeros if necessary
> >
> >That is, the default for LUKS is AES, XTS, with a simple IV. That
> >strikes me as a pretty ringing endorsement
> On 2021/10/18 05:23, Tomas Vondra wrote:
> > AFAICS the threat model the patch aims to address is an attacker who can
> > observe the data (e.g. a low-privileged OS user), but can't modify the
> > files. Which seems like a reasonable model for shared environments.
>
> I agree this threat model.
>
> And if PostgreSQL is using XTS, there is no different with dm-encrypt.
> The user can use dm-encrypt directly.

dm-encrypt is not always an option and it doesn't actually address the
threat-model that Tomas brought up here anyway, as it would be below the
level that the low-privileged OS user would be looking at. That's not
the only threat model to consider, but it is one which could potentially
be addressed by either XTS or AES-GCM-SIV. There are threat models
which dm-crypt would address, of course, such as data-at-rest (hard
drive theft, improper disposal of storage media, backups which don't
have their own encryption, etc), but, again, dm-crypt isn't always an
option that is available and so I don't agree that we should throw this
out just because dm-crypt exists and may be useable in some cases.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gilles Darold 2021-10-18 16:30:07 Re: [PATCH] Proposal for HIDDEN/INVISIBLE column
Previous Message Robert Haas 2021-10-18 15:45:20 Re: when the startup process doesn't (logging startup delays)