Re: Transparent Data Encryption (TDE) and encrypted files

From: Antonin Houska <ah(at)cybertec(dot)at>
To: "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent Data Encryption (TDE) and encrypted files
Date: 2019-10-09 05:42:56
Message-ID: 77265.1570599776@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Moon, Insung <tsukiwamoon(dot)pgsql(at)gmail(dot)com> wrote:

> Hello.
>
> On Tue, Oct 8, 2019 at 8:52 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > > On Mon, Oct 7, 2019 at 3:01 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > > However the design doesn't seem to be stable enough at the
> > > > moment for coding to make sense.
> > >
> > > Well, I think the question is whether working further on your patch
> > > could produce some things that everyone would agree are a step
> > > forward.
> >
> > It would have made a lot of sense several months ago (Masahiko Sawada actually
> > used parts of our patch in the previous version of his patch (see [1]), but
> > the requirement to use a different IV for each execution of the encryption
> > changes things quite a bit.
> >
> > Besides the relation pages and SLRU (CLOG), which are already being discussed
> > elsewhere in the thread, let's consider other two file types:
> >
> > * Temporary files (buffile.c): we derive the IV from PID of the process that
> > created the file + segment number + block within the segment. This
> > information does not change if you need to write the same block again. If
> > new IV should be used for each encryption run, we can simply introduce an
> > in-memory counter that generates the IV for each block. However it becomes
> > trickier if the temporary file is shared by multiple backends. I think it
> > might still be easier to expose the IV values to other backends via shared
> > memory than to store them on disk ...
>
> I think encrypt a temporary file in a slightly different way.
> Previously, I had a lot of trouble with IV uniqueness, but I have
> proposed a unique encryption key for each file.
>
> First, in the case of the CTR mode to be used, 32 bits are used for
> the counter in the 128-bit nonce value.
> Here, the counter increases every time 16 bytes are encrypted, and
> theoretically, if nonce 96 bits are the same, a total of 64 GiB can be
> encrypted.

> Therefore, in the case of buffile.c that creates a temporary file due
> to lack of work_mem, it is possible to use up to 1GiB per file, so it
> is possible to encrypt to a simple IV value sufficiently safely.
> The problem is that a vulnerability occurs when 96-bit nonce values
> excluding Counter are the same values.

I don't think the lower 32 bits impose any limitation, see
CRYPTO_ctr128_encrypt_ctr32() in OpenSSL: if this lower part overflows, the
upper part is simply incremented. So it's up to the user to decide what
portion of the IV he wants to control and what portion should be controlled by
OpenSSL internally. Of course the application design should be such that no
overflows into the upper (user specific) part occur because those would result
in duplicate IVs.

> I also tried to generate IV using PID (32bit) + tempCounter (64bit) at
> first, but in the worst-case PID and tempCounter are used in the same
> values.
> Therefore, the uniqueness of the encryption key was considered without
> considering the uniqueness of the IV value.

If you consider 64bit counter insufficient (here it seems that tempCounter
counts the 1GB segments), then we can't even use LSN as the IV for relation
pages.

> The encryption key uses a separate key for each file, as described earlier.

Do you mean a separate key for the whole temporary file, or for a single (1GB)
segment?

> First, it generates a hash value randomly for the file, and uses the
> hash value and KEK (or MDEK) to derive and use the key with
> HMAC-SHA256.
> In this case, there is no need to store the encryption key separately
> if it is not necessary to keep it in a separate IV file or memory.
> (IV is a hash value of 64 bits and a counter of 32 bits.)

You seem to miss the fact that user of buffile.c can seek in the file and
rewrite arbitrary part. Thus you'd have to generate a new key for the part
being changed.

I think it's easier to use the same key for the whole 1GB segment if not for
the whole temporary file, and generate an unique IV each time we write a chung
(BLCKSZ bytes).

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2019-10-09 05:48:50 First WAL segment file that initdb creates
Previous Message Moon, Insung 2019-10-09 05:34:24 Re: Transparent Data Encryption (TDE) and encrypted files