RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Masahiko Sawada'" <sawada(dot)mshk(at)gmail(dot)com>
Cc: "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2018-07-03 11:21:38
Message-ID: 006501d412c0$026e9790$074bc6b0$@lab.ntt.co.jp
Lists: pgsql-hackers

Dear Masahiko Sawada.

> -----Original Message-----
> From: Masahiko Sawada [mailto:sawada(dot)mshk(at)gmail(dot)com]
> Sent: Monday, June 11, 2018 6:22 PM
> To: Moon, Insung
> Cc: PostgreSQL-development; Joe Conway
> Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
>
> On Fri, May 25, 2018 at 8:41 PM, Moon, Insung <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Hello Hackers,
> >
> > This mail proposes a way to develop "Table-level" Transparent Data
> > Encryption (TDE) and Key Management Service (KMS) support in PostgreSQL.
> >
> >
> > Issues on data encryption of PostgreSQL
> > ==========
> > Currently, in PostgreSQL, data can be encrypted using the pgcrypto module.
> > However, it is inconvenient to use pgcrypto to encrypt data in some cases.
> >
> > There are two significant inconveniences.
> >
> > First, if we use pgcrypto to encrypt/decrypt data, we must call pgcrypto functions everywhere we encrypt/decrypt.
> > Second, we must heavily modify application code if we want to migrate
> > to PostgreSQL from another database that uses TDE.
> >
> > To resolve these inconveniences, many users want PostgreSQL to support TDE.
> > There have also been a few proposals, comments, and questions to support TDE in the PostgreSQL community.
> >
> > However, PostgreSQL does not currently support TDE, so the development
> > community has been discussing whether it is necessary to support TDE.
> >
> > These discussions identified the following requirements for supporting TDE in PostgreSQL.
> >
> > 1) The performance overhead of encrypting and decrypting database data
> > must be minimized.
> > 2) Need to support WAL encryption.
> > 3) Need to support Key Management Service.
> >
> > Therefore, I'd like to propose a new design of TDE that addresses the above requirements.
> > Since this feature will become very large, I'd like to hear opinions from the community before I start writing the patch.
> >
> > First, my proposal is table-level TDE, in which the user can specify which tables are encrypted.
> > Indexes, TOAST tables, and WAL associated with a TDE-enabled table are also encrypted.
> >
> > Moreover, I want to support encryption for large objects as well.
> > But I haven't found a good way to do it so far, so I'd like to leave it as a future TODO.
> >
> > My proposal for "table-level TDE" has five characteristic features.
> >
> > 1) Buffer-level data encryption and decryption
> > 2) Per-table encryption
> > 3) 2-tier encryption key management
> > 4) Working with external key management services(KMS)
> > 5) WAL encryption
> >
> > Here are more details for each item.
> >
> >
> > 1. Buffer-level data encryption and decryption
> > ==================
> > Transparent data encryption and decryption accompany storage operations.
> > With the ordinary approach, such as using pgcrypto, the biggest problem
> > with encrypted data is the performance overhead of decrypting the data each time a query runs.
> >
> > My proposal is to encrypt and decrypt data when performing disk I/O, to minimize performance overhead.
> > Therefore, the data in the shared memory layer stays unencrypted, which keeps the performance overhead small.
> >
> > With this design, data encryption/decryption can be implemented
> > by modifying the code of the storage and buffer manager
> > modules, which are responsible for disk I/O.
> >
> >
> > 2. Per-table encryption
> > ==================
> > Users can enable TDE per table as they want.
> > I introduce a new storage parameter "encryption_enabled", which enables TDE at table level.
> >
> > -- Create an encrypted table
> > CREATE TABLE foo WITH ( ENCRYPTION_ENABLED = ON );
> >
> > -- Change it to a non-encrypted table
> > ALTER TABLE foo SET ( ENCRYPTION_ENABLED = OFF );
> >
> > This approach minimizes the overhead for tables that do not require encryption.
> > For tables that enable TDE, a corresponding table key is
> > generated from random values and stored in a new system catalog after being encrypted with the master key.
> >
> > BTW, I want to support CBC mode encryption[3]. However, I'm not sure how to handle the IV in CBC mode for this proposal.
> > I'd like to hear opinions from security engineers.
> >
> >
> > 3. 2-tier encryption key management
> > ==================
> > When it comes time to change cryptographic keys, decrypting and re-encrypting
> > all data incurs a performance overhead.
> >
> > To solve this problem we employ 2-tier encryption.
> > With 2-tier encryption, all table keys are stored in the database
> > cluster after being encrypted with the master key, and master keys must be stored outside PostgreSQL.
> >
> > Therefore, without the master key, it is impossible to decrypt the table key,
> > and thus impossible to decrypt the database data.
> >
> > When changing the master key, it is not necessary to re-encrypt all data.
> > Only the table keys are decrypted and then re-encrypted with the new master key,
> > which minimizes the performance overhead.
> >
> > As for table keys, each TDE-enabled table has its own table key.
> > As for master keys, each database has its own master key. Table keys are encrypted with the master key of their own database.
> > For WAL encryption, we have another cryptographic key. The WAL key is also
> > encrypted with a master key, but it is shared across the database cluster.
> >
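
As a rough illustration of the 2-tier scheme above, here is a minimal Python sketch. It is not PostgreSQL code: the Database class, its method names, and toy_cipher are hypothetical, and toy_cipher is an insecure stand-in for a real cipher such as AES. It only shows that rotating the master key re-wraps the table keys and leaves the encrypted table data untouched.

```python
import hashlib
import secrets


def toy_cipher(key, data):
    """XOR data with a SHA-256 counter keystream.

    Placeholder only, NOT secure. Applying it twice with the same key
    returns the original data, so it serves as both encrypt and decrypt.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Database:
    """Hypothetical model: one master key per database, one key per table."""

    def __init__(self):
        # Assumption: in the proposal this key would live in an external KMS.
        self.master_key = secrets.token_bytes(32)
        # Assumption: in the proposal this would be a new system catalog.
        self.wrapped_table_keys = {}

    def create_encrypted_table(self, name):
        table_key = secrets.token_bytes(32)
        # Only the wrapped (encrypted) form of the table key is stored.
        self.wrapped_table_keys[name] = toy_cipher(self.master_key, table_key)

    def table_key(self, name):
        # Unwrapping requires the current master key.
        return toy_cipher(self.master_key, self.wrapped_table_keys[name])

    def rotate_master_key(self):
        new_master = secrets.token_bytes(32)
        # Unwrap each table key with the old master key and re-wrap it
        # with the new one. Table data itself is never touched.
        self.wrapped_table_keys = {
            name: toy_cipher(new_master, toy_cipher(self.master_key, wrapped))
            for name, wrapped in self.wrapped_table_keys.items()
        }
        self.master_key = new_master
```

The cost of a rotation in this sketch is proportional to the number of table keys, not to the amount of table data.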
> >
> > 4. Working with external key management services(KMS)
> > ==================
> > A key management service is an integrated approach
> > for generating, fetching and managing encryption keys for key control.
> > Such services may cover all aspects of security, from the secure generation of
> > keys, through secure storage and secure fetching of keys, up to encryption key handling.
> > Also, various types of KMSs are provided by many companies, and users can choose among them.
> >
> > Therefore I would like to manage the master key using KMS.
> > Also, my proposal is to create callback APIs(generate_key, fetch_key,
> > store_key) in the form of a plug-in so that users can use many types of KMS as they want.
> >
> > The KMIP protocol and most KMSs manage keys by string IDs. We can get keys from a KMS by key ID.
> > So in my proposal, each master key is identified by its ID, called the "master key ID".
> > The master key ID is made, for example, using the database oid and a
> > sequence number, like <OID>_<SeqNo>. These IDs are managed in PostgreSQL.
> >
> > At database startup, all master key IDs are loaded into shared memory, where they are protected by an LWLock.
> >
> > When it comes time to rotate the master keys, run this query.
> >
> > ALTER SYSTEM ROTATION MASTER KEY;
> >
> > In this query, the master key is rotated with the following steps.
> > 1. Generate a new master key
> > 2. Change master key IDs and emit corresponding WAL
> > 3. Re-encrypt all table keys on its database
> >
> > Also, during a checkpoint, the master key IDs in shared memory are made persistent.
> >
> >
> > 5. WAL encryption
> > ==================
> > If we encrypt all WAL records, performance overhead can be significant.
> > Therefore, I propose encrypting only the WAL record body, excluding the
> > WAL header, when writing WAL to the WAL buffer, instead of encrypting the whole WAL record.
> > The WAL encryption key is generated separately when a TDE-enabled table
> > is created for the first time. We use 2-tier encryption for WAL encryption as well.
> > So, when it comes time to rotate the WAL encryption key, run this query.
> >
> > ALTER SYSTEM ROTATION WAL KEY;
> >
> > Next, I will explain how to encrypt WAL.
> >
> > To do this, I add a flag to the WAL header which indicates whether the subsequent WAL data is encrypted or not.
> >
> > Then, when we write WAL for an encrypted table, we write "encrypted" WAL at the WAL buffer layer.
> >
> > During recovery, we read the WAL header, check the encryption flag, and decide whether the WAL must be decrypted.
> > In the case of PITR, we use the WAL key ID in the backup file.
> >
> > With this approach, the performance overhead of writing and reading
> > the WAL for unencrypted tables would be almost the same as before.
> >
> >
> > ==================
> > I'd like to discuss the design before making any code changes.
> > After more discussion, I want to make a PoC.
> > Feedback and suggestions are very welcome.
> >
> > Finally, thank you to Masahiko Sawada for the initial design input.
> >
> > Thank you.
> >
> > [1] What does TDE mean?
> > https://en.wikipedia.org/wiki/Transparent_Data_Encryption
> >
> > [2] What does KMS mean?
> > https://en.wikipedia.org/wiki/Key_management#Key_Management_System
> >
> > [3] What does CBC-Mode mean?
> > https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation
> >
> > [4] Recently discussed mail
> > https://www.postgresql.org/message-id/CA%2BCSw_tb3bk5i7if6inZFc3yyf%2B9HEVNTy51QFBoeUk7UE_V%3Dw%40mail.gmail.com
> >
> >
>
> As per the discussion at the PGCon unconference, I think that we first need to discuss what threats we want to defend database
> data against. If a user wants to defend against the threat of a malicious user who has logged in to the OS or the database stealing
> important data from the database, this TDE design would not help, because such a user can steal the data by getting a memory dump or by SQL.
> That of course differs depending on system requirements or security compliance, but what threats do you want to defend
> database data against, and why?

Yes. I am checking requirement 3.4 of PCI-DSS.
That requirement concerns encryption of stored data.
This idea does not protect data against a memory dump (including a core dump).
If encryption at the memory layer turns out to be required, I will revisit this idea.
I will also do a little more research on enterprise requirements for data encryption.
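
To make that boundary concrete, here is a minimal Python sketch under my own assumptions. Storage, BufferPool, and toy_cipher are hypothetical names, not PostgreSQL code, and toy_cipher is an insecure stand-in for a real cipher. Pages are encrypted only when they are written to disk, so stored data is unreadable without the key, while the buffer pool (and therefore a memory dump) still holds plaintext.

```python
import hashlib


def toy_cipher(key, data):
    """XOR data with a SHA-256 counter keystream.

    Placeholder only, NOT secure. Applying it twice with the same key
    returns the original data, so it serves as both encrypt and decrypt.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Storage:
    """Hypothetical disk layer: holds only encrypted page images."""

    def __init__(self, table_key):
        self.table_key = table_key
        self.disk = {}  # page number -> ciphertext

    def write_page(self, page_no, plaintext):
        # Encryption happens at the disk write boundary.
        self.disk[page_no] = toy_cipher(self.table_key, plaintext)

    def read_page(self, page_no):
        # Decryption happens at the disk read boundary.
        return toy_cipher(self.table_key, self.disk[page_no])


class BufferPool:
    """Hypothetical shared buffer: holds plaintext pages."""

    def __init__(self, storage):
        self.storage = storage
        self.pages = {}  # page number -> plaintext

    def get(self, page_no):
        # A page is decrypted once on load; later reads hit plaintext.
        if page_no not in self.pages:
            self.pages[page_no] = self.storage.read_page(page_no)
        return self.pages[page_no]

    def put(self, page_no, data):
        self.pages[page_no] = data

    def flush(self):
        for page_no, data in self.pages.items():
            self.storage.write_page(page_no, data)
```

In this sketch, anyone who can read BufferPool.pages sees the data in the clear, which is exactly the memory-dump case that this design does not cover.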

>
> Also, if I understand correctly, at the unconference session there were also two suggestions about the design other than the
> suggestion by Alexander: implementing TDE at column level using POLICY, and implementing TDE at tablespace level.
> The former was suggested by Joe, but I'm not sure about the details of that suggestion. I'd love to hear them.
> The latter was suggested by Tsunakawa-san.
> Have you considered these?

First, thank you to Joe and Tsunakawa-san.
I am thinking of table-level encryption, but I will try to find the best approach through this discussion.

>
> You mentioned that encryption of temporary data for query processing and of large objects is still under consideration.
> But besides those, you should also consider the temporary data generated by other subsystems, such as reorderbuffer and transition
> tables.

Yes. I consider encryption of temporary data, large objects, and other such data essential.
I have not yet decided how to encrypt temporary data. I will make a PoC patch and work out how to encrypt temporary data there.

Thank you and Best regards.
Moon.

>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
