Re: WIP: Data at rest encryption

From: Ants Aasma <ants(dot)aasma(at)gmail(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Data at rest encryption
Date: 2016-06-12 07:13:23
Message-ID: CA+CSw_u1TPjWA66CvmAHgkSxQmpVHXOQNwXXMMBQQDp+FCxK2A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi
<kommi(dot)haribabu(at)gmail(dot)com> wrote:
> 1. Instead of doing the entire database files encryption, how about
> providing user an option to protect only some particular tables that
> wants the encryption at table/tablespace level. This not only provides
> an option to the user, it reduces the performance impact on tables
> that doesn't need any encryption. The problem with this approach
> is that every xlog record needs to validate to handle the encryption
> /decryption, instead of at page level.

Is there a real need for this? The customers I have talked to want to
encrypt the whole database and my goal is to make the feature fast
enough to make that feasible for pretty much everyone. I guess
switching encryption off per table would be feasible, but the key
setup would still need to be done at server startup. Per-record
encryption would result in some additional information leakage, though.
Overall I thought it would not be worth it, but I'm willing to have my
mind changed on this.

> 2. Instead of depending on a contrib module for the encryption, how
> about integrating pgcrypto contrib in to the core and add that as a
> default encryption method. And also provide an option to the user
> to use a different encryption methods if needs.

Technically that would be simple enough; this is more of a policy
decision. I think having builtin encryption provided by pgcrypto is
completely fine. If a consensus emerges that it needs to be
integrated, it would need to be a separate patch anyway.

> 3. Currently entire xlog pages are encrypted and stored in the file.
> can pg_xlogdump works with those files?

Technically yes; with the patch as it stands, no. I have added this to
my todo list.

> 4. For logical decoding, how about the adding a decoding behavior
> based on the module to decide whether data to be encrypted/decrypted.

The data to be encrypted does not depend on the module used, so I
don't think it should be module controlled. The reorder buffer
contains pretty much the same stuff as the xlog, so not encrypting it
does not look like a valid choice. For logical heap rewrites it could
be argued that nothing useful is leaked in most cases, but encrypting
it is not hard. Just a small matter of programming.

> 5. Instead of providing passphrase through environmental variable,
> better to provide some options to pg_ctl etc.

That looks like it would be worse from a security perspective.
Integrating a passphrase prompt would be an option, but a way for
scripts to provide passphrases would still be needed.
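
To make the trade-off concrete, here is a minimal sketch of a
passphrase lookup that serves both scripts and interactive use. It is
an illustration only, not code from the patch: the variable name
EXAMPLE_DB_PASSPHRASE and the function get_passphrase are hypothetical.
The usual concern with command-line options is that arguments tend to
be visible in process listings, which is presumably why an option to
pg_ctl looks worse.

```python
import getpass
import os
import sys

# Hypothetical variable name, chosen for this sketch only.
ENV_VAR = "EXAMPLE_DB_PASSPHRASE"


def get_passphrase(env=None, prompt=getpass.getpass, interactive=None):
    """Return the passphrase from the environment, else prompt for it.

    The variable is removed from the mapping once read, so that child
    processes started later do not inherit it.
    """
    env = os.environ if env is None else env
    value = env.pop(ENV_VAR, None)
    if value:
        return value
    if interactive is None:
        interactive = sys.stdin.isatty()
    if interactive:
        # Interactive fallback: ask on the terminal without echoing.
        return prompt("Database passphrase: ")
    raise RuntimeError(
        "no passphrase in %s and no terminal to prompt on" % ENV_VAR
    )
```

A startup script would export the variable before launching the
server, while an operator at a terminal would simply be prompted.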

> 6. I don't have any idea whether is it possible to integrate the checksum
> and encryption in a single shot to avoid performance penalty.

Currently no. The checksum gets stored in the page header, and for any
decent cipher mode the encryption of the rest of the page will depend
on it. However, the performance difference should be negligible
because both algorithms are compute-bound for cached data. The data is
very likely to be completely in L1 cache as the operations are done in
quick succession.

The non-cryptographic checksum algorithm could actually be an attack
vector for an adversary that can trigger repeated encryption by
tweaking a couple of bytes at the end of the page to see when the
checksum matches and try to infer the data from that, similar to the
CRIME attack. However, the LSN stored at the beginning of the page
header basically provides a nonce that makes this impossible.
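
The nonce argument can be illustrated with a toy example. This is a
sketch under loud assumptions, not the cipher used by the patch: the
"block cipher" is an HMAC-based stand-in, the chaining is CFB-style
with a fixed IV (which on its own would leave the first block weak),
and the 64-byte page layout built by make_page is hypothetical. The
only point is that when ciphertext blocks are chained, a different LSN
at the start of the page changes every block that follows.

```python
import hashlib
import hmac
import struct

BLOCK = 16  # toy block size in bytes


def _prf(key, data):
    """Keyed function standing in for a real block cipher (toy only)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]


def toy_encrypt_page(key, page, iv=bytes(BLOCK)):
    """CFB-style chaining: each ciphertext block keys the next block."""
    assert len(page) % BLOCK == 0
    out, prev = [], iv
    for i in range(0, len(page), BLOCK):
        keystream = _prf(key, prev)
        block = bytes(p ^ k for p, k in zip(page[i:i + BLOCK], keystream))
        out.append(block)
        prev = block  # the ciphertext just produced feeds the next step
    return b"".join(out)


def toy_decrypt_page(key, ciphertext, iv=bytes(BLOCK)):
    """Inverse of toy_encrypt_page."""
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        keystream = _prf(key, prev)
        block = ciphertext[i:i + BLOCK]
        out.append(bytes(c ^ k for c, k in zip(block, keystream)))
        prev = block
    return b"".join(out)


def make_page(lsn, body):
    """Hypothetical layout: 8-byte LSN, then body, zero-padded to 64 bytes."""
    return (struct.pack("<Q", lsn) + body).ljust(64, b"\0")
```

Encrypting make_page(1, body) and make_page(2, body) with the same key
gives ciphertexts that differ in every block, so an attacker who
forces a rewrite of the page (which advances the LSN) cannot compare
the new ciphertext against the old one block by block.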

This also means that encryption needs to imply wal_log_hints. Will
include this in the next version of the patch.

>> I would also like to incorporate some database identifier as a salt in
>> key setup. However, system identifier stored in control file doesn't
>> fit this role well. It gets initialized somewhat too late in the
>> bootstrap process, and more importantly, gets changed on pg_upgrade.
>> This will make link mode upgrades impossible, which seems like a no
>> go. I'm torn whether to add a new value for this purpose (perhaps
>> stored outside the control file) or allow setting of system identifier
>> via initdb. The first seems like a better idea, the file could double
>> as a place to store additional encryption parameters, like key length
>> or different cipher primitive.
>
> I feel separate file is better to include the key data instead of pg_control
> file.

I guess that would be more flexible. However, I think at least the fact
that the database is encrypted should remain in the control file to
provide useful error messages for faulty backup procedures.
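
As a rough sketch of what such a separate file could look like, the
example below stores a random salt together with the key length and
cipher name, and uses the salt in key setup. Everything here is an
assumption made for illustration: the JSON format, the field names,
the helper functions, and the choice of PBKDF2 as the key derivation
function are not taken from the patch.

```python
import hashlib
import json
import os


def create_params(path, key_len=16, cipher="aes-128", iterations=200000):
    """Write a hypothetical parameter file with a fresh random salt."""
    params = {
        "salt": os.urandom(16).hex(),
        "key_len": key_len,          # key length in bytes
        "cipher": cipher,            # name of the cipher primitive
        "kdf_iterations": iterations,
    }
    with open(path, "w") as f:
        json.dump(params, f)
    return params


def load_params(path):
    """Read the parameter file back."""
    with open(path) as f:
        return json.load(f)


def derive_key(passphrase, params):
    """Derive the encryption key from the passphrase and the stored salt."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        bytes.fromhex(params["salt"]),
        params["kdf_iterations"],
        dklen=params["key_len"],
    )
```

Because the salt is generated once and lives in its own file, it would
not change the way the system identifier does, as long as the file is
carried along with the data directory.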

Thanks for your input.

Regards,
Ants Aasma
