Re: XTS cipher mode for cluster file encryption

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Sasasu <i(at)sasa(dot)su>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XTS cipher mode for cluster file encryption
Date: 2021-10-21 17:28:12
Message-ID: 20211021172812.GZ20998@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Sasasu (i(at)sasa(dot)su) wrote:
> On 2021/10/20 20:24, Stephen Frost wrote:
> > PG does have a block-based IO API, it's just not exposed as hooks. In
> > particular, take a look at md.c, though perhaps you'd be more interested
> > in the higher level bufmgr.c routines. For the specific places where
> > encryption may hook in, looking at the DataChecksumsEnabled() call sites
> > may be informative (there aren't that many of them).
>
> md.c is great, easy to understand. but PG does not have a unified API. There
> has many unexpected pread()/pwrite() in many corners. md.c only for heap
> table, bufmgr.c only for a buffered heap table.

There's certainly other calls out there, yes.

> eg: XLogWrite() looks like a block API, but is a range write. equivalent to
> the append(2)

XLog is certainly another thing that has to be dealt with, of course,
but I don't see us trying to shoehorn that into using md.c somehow.

> eg: ALTER DATABASE SET TABLESPACE , the movedb() call. use copy_file() on
> heap table. which is just pread() pwrite() with 8*BLCKSZ.
> eg: all front-end tools use pread() to read heap table. in particular,
> pg_rewind write heap table by offset.
> eg: in contrib, pg_standby use system("cp") to copy WAL.

None of these are actually working with or changing the data though,
they're just copying it. I don't think we'd actually want these to
decrypt and reencrypt the data.

> On 2021/10/20 20:24, Stephen Frost wrote:
> > Breaking our ondisk format explicitly means that pg_upgrade won't work
> > any longer and folks won't be able to do in-place upgrades. That's a
> > pretty huge deal and it's something we've not done in over a decade.
> > I doubt that's going to fly.
>
> I completely agree.

Great.

> On 2021/10/20 20:24, Stephen Frost wrote:
> > Yes, using another fork for this is something that's been considered but
> > it's not without its own drawbacks, in particular having to do another
> > write and later fsync when a page changes.
> >
> > Further, none of this is necessary for XTS, but only for GCM. This is
> > why it was put forward that GCM involves a lot more changes to the
> > system and means that we won't be able to do things like binary
> > replication to switch from an unencrypted to encrypted cluster. Those
> > are good reasons to consider an XTS implementation first and then later,
> > perhaps, implement GCM.
>
> same as Robert Haas. I wish PG can do some infrastructure first. add more
> abstract layers like md.c (maybe a block-based API with ondisk format
> version field). so people can dive in without understanding the things which
> isolated by the abstract layer.

I really don't think this is necessary. Similar to PageSetChecksumCopy
and PageSetChecksumInplace, I'm sure we would have functions which are
called in the appropriate spots to do encryption (such as 'encrypt_page'
and 'encrypt_block' in the Cybertec patch) and folks could review those
in relative isolation to the rest. Dealing with blocks in PG is already
pretty well handled, the infrastructure that needs to be added is around
handling temporary files and is being actively worked on ... if we could
move past this debate around if we should be adding support for XTS or
if only GCM-SIV would be accepted.

> On 2021/10/20 20:24, Stephen Frost wrote:
> > What's the point of using GCM if we aren't going to actually verify the
> > tag? Also, the Cybertec patch didn't add an extra reserved field to the
> > page format, and it used CTR anyway..
>
> Oh, I am wrong, Cybertec patch can not use XTS, because WAL may not be
> aligned to 16bytes. for WAL need a stream cipher. The CTR implement is still
> correct.

No, the CTR approach isn't great because, as has been discussed quite a
bit already, using the LSN as the IV means that different data might be
re-encrypted with the same LSN and that's not an acceptable thing to
have happen with CTR.

> CTR with hash(offset) as IV is basically equal to XTS. if use another AES
> key to encrypt the hash(offset), and block size is 16bytes it is XTS.

I don't understand why we're talking about CTR+other-stuff. Maybe if
you use CTR and then do other things then it's equivilant in some
fashion to XTS ... but then it's not CTR anymore and we shouldn't be
calling it that. Saying that we should do "CTR+other stuff" (which
happens to make it equivilant to XTS) instead of just saying we should
use "XTS" is very confusing and further means that we're starting down
the path of trying to come up with our own hack on existing encryption
schemes, and regardless of who shows up on this list to claim that doing
so makes sense and is "right", I'm going to be extremely skeptical.

> The point is that can not save random IV for WAL without adding a reserved
> field, no matter use GCM or CTR.

Yes, it's correct that we can't use a random IV for the WAL without
figuring out how to keep track of that random IV. Thankfully, for WAL
(unlike heap and index blocks) we don't really have that issue- we
hopefully aren't going to write different WAL records at the same LSN
and so using the LSN there should be alright. There's some odd edge
cases around this too (split-brain situations in particular where one of
the systems does crash recovery and doesn't actually promote and
therefore stays on the same timeline), but that kind of thing ends up
breaking other things too (WAL archiving, as an example) and isn't
really something we can support. Further, I figure we'll tell people to
use different keys for different systems anyway to avoid this risk.

> Because WAL only does append to the end, using CTR for WAL is safer than
> using XTS for heap table. If you want more security for WAL encryption, add
> HKDF[1].

I don't follow how you're making these comparisons as to what is "safer"
for which when talking about two different encryption methodologies and
two very different systems (the heap vs. WAL) or if you're actually
suggesting something here.

We've discussed at length how using CTR for heap isn't a good idea even
if we're using the LSN for the IV, while if we use XTS then we don't
have the issues that CTR has with IV re-use and using the LSN (plus
block number and perhaps other things). Nothing in what has been
discussed here has really changed anything there that I can see and so
it's unclear to me why we continue to go round and round with it.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-10-21 19:29:26 Re: ThisTimeLineID can be used uninitialized
Previous Message Fujii Masao 2021-10-21 17:14:40 Re: pgstat_assert_is_up() can fail in walsender