Re: WIP: Data at rest encryption

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Data at rest encryption
Date: 2017-06-16 08:06:39
Message-ID: 1e0a11ad-1255-63ff-a11e-1365003c44af@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16.06.2017 03:08, Bruce Momjian wrote:
>
> Yeah, I guess we will just have to wait to see it since other people are
> excited about it. My concern is code complexity and usability
> challenges, vs punting the problem to the operating system, though
> admittedly there are some cases where that is not possible.
>

Let me also share my opinion about encryption and compression support at
database level.
PostgresPro Enterprise does support both. I have made presentation about
it at PgConn 2016 in Tallinn.
I was a little bit surprised that there were more questions about
encryption than about compression.
But right now we have several customers which are using compression and
none of them use encryption (just because them do not need
to protect their databases). But I absolutely sure that there are many
Postgres users which first of all need to protect their data.

Encryption is much easier to implement than compression, because it is
not changing page size. So I do not see any "complexity and flexibility
challenges" here.
Just for reference I attached to this mail our own encryption patch. I
do not want to propose it as alternative to Aasmas patch: it is less
flexible and doesn't support encryption of WAL, just encryption of
relation data. Also it doesn't allow custom encryption libraries: AES
implementation is embedded. Encryption cipher is taken from environment
variable. At Tallin's conferences I was informed about possible security
issue with passing key through environment variable: it is possible to
inspect server's environment variables using plpython/plperl stored
procedure.
This is why we unset this environment variable after reading. I am not
expect in security, but I do not know other issues with such solution.

Concerning the question whether to implement compression/encryption on
database level or rely on OS, my opinion is that there are many
scenarios where it is not possible or is not desirable to use OS level
encryption/protection. It first of all includes cloud installations and
embedded applications. I do not want to repeat arguments already
mentioned in this thread.
But the fact is that there are many people which really need
compression/encryption support and them can not or do not want to
redirect this aspects to OS. Almost all DBMSes are supporting
compression encryption, so lack of this features in Postgres definitely
can not be considered as Postgres advantage.

Postgres buffer manager interface significantly simplifies integration
of encryption and compression. There is actually single path through
which data is fetched/stored to the disk.
It is most obvious and natural solution to decompress/decrypt data when
it is read from the disk to page pool and compress/encrypt it when it is
written back. Taken in account that memory is cheap now and many
databases can completely fit in memory, storing pages in the buffer
cache in plain (decompressed/decrypted) format allows to minimize
overhead of compression/encryption and its influence on performance. For
read only queries working with cached data performance will be exactly
the same as without encryption/compression.
Write speed for encrypted pages will be certainly slightly worse, but
still encryption speed is much higher than disk IO speed.

So I do not think that it is really necessary to support encryption of
some particular tables, storing "non-secrete" data in plain format
without encryption. It should not cause noticeable improve of
performance, but may complicate implementation and increase possibility
of leaking secure data.

I do not think that pluggable storage API is right approach to integrate
compression and especially encryption. It is better to plugin encryption
between buffer manager and storage device,
allowing to use it with any storage implementation. Also it is not
clear to me whether encryption of WAL can be provided using pluggable
storage API.

The last discussed question is whether it is necessary to encrypt
temporary data (BufFile). In our solution we encrypt only main fork of
non-system relations and do no encrypt temporary relations. It may cause
that some secrete data will be stored at this disk in non-encrypted
format. But accessing this data is not trivial. You can not just
copy/stole disk, open database and do "select * from SecreteTable": you
will have to extract data from raw file yourself. So looks like it is
better to allow user to make choice whether to encrypt temporary data or
not.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
aes.patch text/x-patch 80.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-06-16 08:16:08 Re: Adding support for Default partition in partitioning
Previous Message Rushabh Lathia 2017-06-16 07:49:38 RLS policy not getting honer while pg_dump on declarative partition