Re: XTS cipher mode for cluster file encryption

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Sasasu <i(at)sasa(dot)su>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XTS cipher mode for cluster file encryption
Date: 2021-10-20 12:24:08
Message-ID: 20211020122407.GW20998@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Sasasu (i(at)sasa(dot)su) wrote:
> But if PG had a clear block-based IO API, TDE would be much easier to
> understand.

PG does have a block-based IO API; it's just not exposed as hooks. In
particular, take a look at md.c, though perhaps you'd be more interested
in the higher level bufmgr.c routines. For the specific places where
encryption may hook in, looking at the DataChecksumsEnabled() call sites
may be informative (there aren't that many of them).

> Security people may lack database knowledge, but they can understand
> block IO. This would allow more people to join the PG community.

We'd certainly welcome them. I don't think we're going to try to
redesign our entire IO subsystem in the hopes that they'll show up
though.

> On 2021/10/20 02:54, Stephen Frost wrote:
> > Where would you store the tag for GCM without changes in core?
>
> If we can add a 32-bit reserved field (in GCM it is 28 bits), that
> would be best.

That's the idea that's been discussed, but the approach put forward is
to do it in a manner that allows the same binaries to work with a
TDE-enabled cluster and a non-TDE cluster which means two different
formats on disk. This is still a pretty big deal and would require
logical replication or pg_dump/restore to go from unencrypted to
encrypted.

> The data file size will increase by 0.048% (if BLCKSZ = 8KiB), which I
> think is acceptable even for users who do not use TDE, but it needs an
> on-disk format change.

Breaking our on-disk format explicitly means that pg_upgrade won't work
any longer and folks won't be able to do in-place upgrades. That's a
pretty huge deal and it's something we've not done in over a decade.
I doubt that's going to fly.
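
For reference, the 0.048% figure quoted above can be reproduced with a
bit of arithmetic. This is only a sketch: the 4-byte field is the 32-bit
reservation from the quoted message, while the 12-byte nonce and 16-byte
tag are the usual AES-GCM sizes and are my assumption here, not numbers
taken from any patch.

```python
BLCKSZ = 8192  # default PostgreSQL page size in bytes (8 KiB)


def overhead_percent(reserved_bytes: int, page_size: int = BLCKSZ) -> float:
    """Share of each page taken up by a reserved per-page field, in percent."""
    return 100.0 * reserved_bytes / page_size


# 32-bit (4-byte) reserved field, as proposed in the quoted message.
print(f"32-bit field:       {overhead_percent(4):.4f}%")

# Assumption: a full GCM nonce (12 bytes) plus tag (16 bytes) per page.
print(f"nonce + tag (28 B): {overhead_percent(12 + 16):.4f}%")
```

Either way the space cost is small; the objection is to the format
change itself, not to the bytes.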

> If GCM is done without modifying anything in core, the underlying layer
> can write out a key fork and fsync(2) it with the same strategy as for
> the main fork. This is crash-safe; consistency is ensured by WAL (which
> means wal_log_hints needs to be set to on).
> Or the underlying layer can restructure the IO requests, inserting one
> IV block per 2730 (= BLCKSZ/IV_SIZE) data blocks. This is similar to how
> _mdfd_getseg() in md.c splits files into 1GiB segments; the upper layers
> would not notice.
> Both approaches can use a cache to reduce the performance penalty.

Yes, using another fork for this is something that's been considered but
it's not without its own drawbacks, in particular having to do another
write and later fsync when a page changes.
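
To make the second suggestion in the quoted text concrete, here is a
rough sketch of the block-number translation that interleaving IV blocks
would imply. The layout (one IV block placed ahead of each group of data
blocks) and the function names are hypothetical; the group size of 2730
is simply the figure from the quoted message. Nothing like this exists
in md.c today.

```python
BLOCKS_PER_GROUP = 2730  # data blocks per IV block, from the quoted message


def physical_block(logical: int, n: int = BLOCKS_PER_GROUP) -> int:
    """Map a logical data block number to its physical block number.

    Hypothetical layout: each group is one IV block followed by n data
    blocks, so every group occupies n + 1 physical blocks.
    """
    group, offset = divmod(logical, n)
    return group * (n + 1) + 1 + offset


def iv_location(logical: int, n: int = BLOCKS_PER_GROUP) -> tuple[int, int]:
    """Return (physical IV block, slot within it) for a logical data block."""
    group, slot = divmod(logical, n)
    return group * (n + 1), slot


# Logical block 0 sits right after the first IV block (physical block 0).
print(physical_block(0), iv_location(0))
# The first block of the second group skips over a second IV block.
print(physical_block(2730), iv_location(2730))
```

Note that even with this mapping, changing a data page still means
changing its IV block too, so the extra write and fsync don't go away.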

Further, none of this is necessary for XTS, but only for GCM. This is
why it was put forward that GCM involves a lot more changes to the
system and means that we won't be able to do things like binary
replication to switch from an unencrypted to encrypted cluster. Those
are good reasons to consider an XTS implementation first and then later,
perhaps, implement GCM.

> For WAL encryption, the CybertecDB implementation is correct: we cannot
> write any extra data without adding a reserved field in core, because
> consistency cannot be guaranteed. If GCM is used for WAL encryption,
> HMAC verification must be disabled.

What's the point of using GCM if we aren't going to actually verify the
tag? Also, the Cybertec patch didn't add an extra reserved field to the
page format, and it used CTR anyway.

Thanks,

Stephen
