Re: Moving forward with TDE [PATCH v3]

From: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, vignesh C <vignesh21(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Moving forward with TDE [PATCH v3]
Date: 2023-11-08 23:01:38
Message-ID: CAOxo6XJY-F3mh=epBHLSyb9befM6B0PhgxCs5KO9X3C99YAQhQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 7, 2023 at 5:49 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2023-11-06 09:56:37 -0500, Stephen Frost wrote:
> > * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > > I still am quite quite unconvinced that using the LSN as a nonce is a
> good
> > > design decision.
> >
> > This is a really important part of the overall path to moving this
> > forward, so I wanted to jump to it and have a specific discussion
> > around this. I agree that there are downsides to using the LSN, some of
> > which we could possibly address (eg: include the timeline ID in the IV),
> > but others that would be harder to deal with.
>
> > The question then is- what's the alternative?
> >
> > One approach would be to change the page format to include space for an
> > explicit nonce. I don't see the community accepting such a new page
> > format as the only format we support though as that would mean no
> > pg_upgrade support along with wasted space if TDE isn't being used.
>
> Right.
>

Hmm, if we /were/ to introduce some sort of page format change, Couldn't
that be a use case for modifying the pd_version field? Could v4 pages be
read in and written out as v5 pages with different interpretations?

> > Ideally, we'd instead be able to support multiple page formats where
> > users could decide when they create their cluster what features they
> > want- and luckily we even have such an effort underway with patches
> > posted for review [1].
>
> I think there are some details wrong with that patch - IMO the existing
> macros
> should just continue to work as-is and instead the places that want the
> more
> narrow definition should be moved to the new macros and it changes places
> that
> should continue to use compile time constants - but it doesn't seem like a
> fundamentally bad idea to me. I certainly like it much better than making
> the
> page size runtime configurable.
>

There had been some discussion about this WRT renaming macros and the like
(constants, etc)—I think a new pass eliminating the variable blocksize
pieces and seeing if we can minimize churn here is worthwhile, will take a
look and see what the minimally-viable set of changes is here.

> (I'll try to reply with the above points to [1])
>
>
> > Certainly, with the base page-special-feature patch, we could have an
> option
> > for users to choose that they want a better nonce than the LSN, or we
> could
> > bundle that assumption in with, say, the authenticated-encryption feature
> > (if you want authenticated encryption, then you want more from the
> > encryption system than the basics, and therefore we presume you also
> want a
> > better nonce than the LSN).
>
> I don't think we should support using the LSN as a nonce if we have an
> alternative. The cost and complexity overhead is just not worth it. Yes,
> it'll be harder for users to migrate to encryption, but adding complexity
> elsewhere in the system to get an inferior result isn't worth it.
>

From my read, XTS (which I'd see as inferior to authenticated encryption,
but better than some other options) could use LSN as an IV without leakage
concerns, perhaps mixing in the BlockNumber as well. If we are going to
allow multiple encryption types, I think we may need to consider that needs
for IVs may be different, so this may need to be something that is
selectable per encryption type.

I am unclear how much of a requirement this is, but seems like having a
design supporting this to be pluggable—even if a static lookup table
internally for encryption type, block length, IV source, etc—seems the most
future proof if we had to retire an encryption method or prevent creation
of specific methods, say.

> > Another approach would be a separate fork, but that then has a number of
> > downsides too- every write has to touch that too, and a single page of
> > nonces would cover a pretty large number of pages also.
>
> Yea, the costs of doing so is nontrivial. If you were trying to implement
> encryption on the smgr level - which I doubt you should but am not certain
> about - my suggestion would be to interleave pages with metadata like
> nonces
> and AEAD with the data pages. I.e. one metadata page would be followed by
> (BLCKSZ - SizeOfPageHeaderData) / (sizeof(nonce) + sizeof(AEAD))
> pages containing actual relation data. That way you still get decent
> locality
> during scans and writes.
>

Hmm, this is actually an interesting idea, I will think about this a bit.

> Relation forks were a mistake, we shouldn't use them in more places.
>
>
> I think it'd be much better if we also encrypted forks, rather than just
> the
> main fork...
>

I believe the existing code should just work by modifying
the PageNeedsToBeEncrypted macro; I will test that and see if anything
blows up.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-11-08 23:05:20 Re: Remove MSVC scripts from the tree
Previous Message Michael Paquier 2023-11-08 22:56:47 Re: pg_upgrade and logical replication