Re: Transparent Data Encryption (TDE) and encrypted files

From: "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent Data Encryption (TDE) and encrypted files
Date: 2019-10-09 06:20:35
Message-ID: CAEMmqBuZ+VUouiEr2PLUsyWrvw8N6zeZ_1Aw3hgEKuGR8HYZcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Antonin Houska.
Thank you for your attention to thie matter.

On Wed, Oct 9, 2019 at 2:42 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> Moon, Insung <tsukiwamoon(dot)pgsql(at)gmail(dot)com> wrote:
>
> > Hello.
> >
> > On Tue, Oct 8, 2019 at 8:52 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > >
> > > Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > > On Mon, Oct 7, 2019 at 3:01 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > > > However the design doesn't seem to be stable enough at the
> > > > > moment for coding to make sense.
> > > >
> > > > Well, I think the question is whether working further on your patch
> > > > could produce some things that everyone would agree are a step
> > > > forward.
> > >
> > > It would have made a lot of sense several months ago (Masahiko Sawada actually
> > > used parts of our patch in the previous version of his patch (see [1]), but
> > > the requirement to use a different IV for each execution of the encryption
> > > changes things quite a bit.
> > >
> > > Besides the relation pages and SLRU (CLOG), which are already being discussed
> > > elsewhere in the thread, let's consider other two file types:
> > >
> > > * Temporary files (buffile.c): we derive the IV from PID of the process that
> > > created the file + segment number + block within the segment. This
> > > information does not change if you need to write the same block again. If
> > > new IV should be used for each encryption run, we can simply introduce an
> > > in-memory counter that generates the IV for each block. However it becomes
> > > trickier if the temporary file is shared by multiple backends. I think it
> > > might still be easier to expose the IV values to other backends via shared
> > > memory than to store them on disk ...
> >
> > I think encrypt a temporary file in a slightly different way.
> > Previously, I had a lot of trouble with IV uniqueness, but I have
> > proposed a unique encryption key for each file.
> >
> > First, in the case of the CTR mode to be used, 32 bits are used for
> > the counter in the 128-bit nonce value.
> > Here, the counter increases every time 16 bytes are encrypted, and
> > theoretically, if nonce 96 bits are the same, a total of 64 GiB can be
> > encrypted.
>
> > Therefore, in the case of buffile.c that creates a temporary file due
> > to lack of work_mem, it is possible to use up to 1GiB per file, so it
> > is possible to encrypt to a simple IV value sufficiently safely.
> > The problem is that a vulnerability occurs when 96-bit nonce values
> > excluding Counter are the same values.
>
> I don't think the lower 32 bits impose any limitation, see
> CRYPTO_ctr128_encrypt_ctr32() in OpenSSL: if this lower part overflows, the
> upper part is simply incremented. So it's up to the user to decide what
> portion of the IV he wants to control and what portion should be controlled by
> OpenSSL internally. Of course the application design should be such that no
> overflows into the upper (user specific) part occur because those would result
> in duplicate IVs.

I'm sorry. I seem to have misunderstood.
If I rechecked the source code of OpenSSL, as you said, it is assumed
that the upper 96bit value is changed using the ctr96_inc() function.
Sorry..

>
> > I also tried to generate IV using PID (32bit) + tempCounter (64bit) at
> > first, but in the worst-case PID and tempCounter are used in the same
> > values.
> > Therefore, the uniqueness of the encryption key was considered without
> > considering the uniqueness of the IV value.
>
> If you consider 64bit counter insufficient (here it seems that tempCounter
> counts the 1GB segments), then we can't even use LSN as the IV for relation
> pages.

The worst-case here is not a lack of tempCounter, but a problem that
occurs when PID is reused after a certain period.
Of course, it is very unlikely to be a problem because it is a
temporary file, but since the file name can know the PID and
tempFileCounter, if you accumulate some data, the same key and the
same IV will be used to encrypt other data. So I thought there could
be a problem.

>
> > The encryption key uses a separate key for each file, as described earlier.
>
> Do you mean a separate key for the whole temporary file, or for a single (1GB)
> segment?

Yes, that's right. Use a separate key per file.

>
> > First, it generates a hash value randomly for the file, and uses the
> > hash value and KEK (or MDEK) to derive and use the key with
> > HMAC-SHA256.
> > In this case, there is no need to store the encryption key separately
> > if it is not necessary to keep it in a separate IV file or memory.
> > (IV is a hash value of 64 bits and a counter of 32 bits.)
>
> You seem to miss the fact that user of buffile.c can seek in the file and
> rewrite arbitrary part. Thus you'd have to generate a new key for the part
> being changed.

That's right. I wanted to ask this too.
Is it possible to overwrite the data already written in the actual buffile.c?
Such a problem seems to become a problem when BufFileWRite function is
called, and BufFileSeek function is called, and BufFileRead is called.
In other words, the file is not written in units of 8kb, but the file
is changed in the pos, and some data is read in another pos.
I also thought that this would be a problem with re-creating the
encrypted file, i.e., IV and key change would be necessary,
So far, my research has found no case of overwriting data in the
previous pos after it has already been created in File data (where
FilWrite is called).
Can you tell me the case overwriting buffer file?Sorry..

>
> I think it's easier to use the same key for the whole 1GB segment if not for
> the whole temporary file, and generate an unique IV each time we write a chung
> (BLCKSZ bytes).

Yes. I think there will probably be a discussion about how to use
enc-key and IV to use.
I hope to find the safest way through various discussions.

Best regards.
Moon.

>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-10-09 06:26:40 Safeguards against incorrect fd flags for fsync()
Previous Message Fujii Masao 2019-10-09 06:16:11 Re: Standby accepts recovery_target_timeline setting?