Re: Internal key management system

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, cary huang <hcary328(at)gmail(dot)com>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Bruce Momjian <bruce(dot)momjian(at)enterprisedb(dot)com>
Subject: Re: Internal key management system
Date: 2020-02-19 15:44:27
Message-ID: CA+fd4k7tq8HQ8dyUFixS7kZAYHDNH-dmi0N2g+WCcHSNyt2g4Q@mail.gmail.com
Lists: pgsql-hackers

On Sat, 15 Feb 2020 at 01:00, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Feb 6, 2020 at 9:19 PM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > This feature protects data from disk thefts but cannot protect data
> > from attackers who are able to access PostgreSQL server. In this
> > design application side still is responsible for managing the wrapped
> > secret in order to protect it from attackers. This is the same as when
> > we use pgcrypto now. The difference is that data is safe even if
> > attackers steal the wrapped secret and the disk. The data cannot be
> > decrypted either without the passphrase which can be stored to other
> > key management systems or without accessing postgres server. IOW for
> > example, attackers can get the data if they get the wrapped secret
> > managed by application side then run pg_kmgr_unwrap() to get the
> > secret and then steal the disk.
>
> If you only care about protecting against the theft of the disk, you
> might as well just encrypt the whole filesystem, which will probably
> perform better and probably be a lot harder to break since you won't
> have short encrypted strings but instead large encrypted blocks of
> data. Moreover, I think a lot of people who are interested in these
> kinds of features are hoping for more than just protecting against the
> theft of the disk. While some people may be hoping for too much in
> this area, setting your sights only on encryption at rest seems like a
> fairly low bar.

This feature also protects data against someone reading the database
files directly. It is also good that it is platform-independent.

To be clear, let me summarize the scenarios where we will and won't be
able to protect data. We can put the cluster key, which is obtained by
cluster_passphrase_command, into another component of the system,
ideally a KMS. The user key is wrapped and saved on an application
server or somewhere the application can obtain it promptly. The
PostgreSQL server has the master key on disk, wrapped by the cluster
key, along with the user data encrypted by the user key. While the
PostgreSQL server is running, a user can unwrap the wrapped user key
with pg_unwrap_key to get the user key. Given that attackers stole the
database disk, which includes the encrypted user data and the wrapped
master key, what they need to complete their attack is (1) the wrapped
user key and access to the PostgreSQL server, (2) the cluster key and
the wrapped user key, or (3) the master key and the wrapped user key.
They cannot get the user data with only one of those secrets: the
cluster key, the master key, or the wrapped user key.

In case (1), PostgreSQL needs to be running and they need to be able
to access the PostgreSQL server, which may require a password, to
execute pg_unwrap_key with the wrapped user key they stole. In case
(2), since the wrapped user key is stored on the application server
and is likely to be accessible without special privileges, it may be
easy for attackers to get it. However, they additionally need to
attack the KMS to get the cluster key. Finally, in case (3), again,
they may be able to steal the wrapped user key. But they also need to
be able to log in to the OS in an unauthorized way and then illegally
read the PostgreSQL shared buffer.

ISTM none of these cases will be easy for attackers.
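
To make the three cases concrete, here is a rough sketch in Python. It
is only an illustration under assumptions: wrap() and unwrap() are a
toy HMAC-based construction standing in for a real key wrap algorithm,
the Python pg_unwrap_key() is a stand-in for a running server, and the
names stolen_disk, case1, case2 and case3 are made up for this example.
It also assumes the user key is wrapped by the master key, as cases
(2) and (3) imply.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = b"enc" + nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def wrap(kek: bytes, secret: bytes) -> bytes:
    """Toy authenticated wrap (NOT a real key wrap): nonce || ciphertext || tag."""
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(secret, _keystream(kek, nonce, len(secret))))
    tag = hmac.new(kek, b"mac" + nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def unwrap(kek: bytes, blob: bytes) -> bytes:
    """Inverse of wrap(); raises ValueError if the key does not match."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(kek, b"mac" + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    return bytes(a ^ b for a, b in zip(ct, _keystream(kek, nonce, len(ct))))


# Where each secret lives in the scenario described above.
cluster_key = os.urandom(32)  # from cluster_passphrase_command, ideally in a KMS
master_key = os.urandom(32)   # only its wrapped form is on the database disk
user_key = os.urandom(32)     # only its wrapped form is on the application server

stolen_disk = {
    "wrapped_master_key": wrap(cluster_key, master_key),
    "encrypted_data": wrap(user_key, b"user data"),
}
wrapped_user_key = wrap(master_key, user_key)


def pg_unwrap_key(wrapped: bytes) -> bytes:
    """Stand-in for a running server, which holds the master key in memory."""
    return unwrap(master_key, wrapped)


def case1(disk, wrapped_uk, server_unwrap):
    """(1) wrapped user key + access to the running server."""
    return unwrap(server_unwrap(wrapped_uk), disk["encrypted_data"])


def case2(disk, wrapped_uk, ck):
    """(2) cluster key + wrapped user key."""
    mk = unwrap(ck, disk["wrapped_master_key"])
    return unwrap(unwrap(mk, wrapped_uk), disk["encrypted_data"])


def case3(disk, wrapped_uk, mk):
    """(3) master key + wrapped user key."""
    return unwrap(unwrap(mk, wrapped_uk), disk["encrypted_data"])
```

In this model each case needs the stolen disk plus two more pieces.
With only one of them, for example the cluster key alone, the attacker
can unwrap the master key but has nothing to unwrap with it, and
unwrap() on the encrypted data fails.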

>
> It also doesn't seem very likely to actually provide any security.
> You're talking about sending the encryption key in the query string,
> which means that there's a good chance it's going to end up in a log
> file someplace. One way that could happen is if the user has
> configured log_statement=all or log_min_duration_statement, but it
> could also happen any time the query throws an error. In theory, you
> might arrange for the log messages to be sent to another server that
> is protected by separate layers of security, but a lot of people are
> going to just log locally. And, even if you do have a separate server,
> do you really want to have the logfile over there be full of
> passwords? I know I can be awfully negative some times, but that it
> seems like a weakness so serious as to make this whole thing
> effectively useless.
>

Since the user key could be written to the server logs, attackers will
be able to get user data by stealing only the database disk if the
server logs locally. But I personally think that, depending on the use
case, it's not a problem serious enough to make this feature
meaningless. Users will likely have one user key per user or one key
per instance. So, for example, in the case where the system doesn't
add new users while running, the user can wrap the user key before the
system starts service and therefore will need to pay attention only at
that time. If users can take care of that, we can accept such a
restriction.

> One way to plug this hole is to use new protocol messages for key
> exchanges. For example, suppose that after authentication is complete,
> you can send the server a new protocol message: KeyPassphrase
> <key-name> <passphrase>. The server stores the passphrase in
> backend-private memory and returns ReadyForQuery, and does not log the
> message payload anywhere. Now you do this:
>
> INSERT INTO tbl VALUES (pg_encrypt('user data', 'key-name'));
> SELECT pg_decrypt(secret_column, 'key-name') FROM tbl;
>
> If the passphrase for the named key has not been loaded into the
> current session's memory, this produces an error; otherwise, it looks
> up the passphrase and uses it to do the decryption. Now the passphrase
> never gets logged anywhere, and, also, the user can't persuade the
> server to provide it with the encryption key, because there's no
> SQL-level function to access that data.
>
> We could take it a step further: suppose that encryption is a column
> property, and the value of the property is a key name. If the user
> hasn't sent a KeyPassphrase message with the relevant key name,
> attempts to access that column just error out. If they have, then the
> server just does the encryption and decryption automatically. Now the
> user can just do:
>
> INSERT INTO tbl VALUES ('user data');
> SELECT secret_column FROM tbl;
>
> It's a huge benefit if the SQL doesn't need to be changed. All that an
> application needs to do in order to use encryption in this scenario is
> use PQsetKeyPassphrase() or whatever before doing whatever else they
> want to do.

Your idea seems good. I think the point from a development perspective
is whether it's worth having such a dedicated feature in order to
provide transparent encryption using pgcrypto. That is, looking at
this feature as a building block of transparent data at rest
encryption, such changes might be overkill. Generally, encrypting data
using pgcrypto is not good in terms of performance. In transparent
data encryption, PostgreSQL would be able to encrypt data with the key
stored inside its database. As I mentioned above, if this feature can
cover a certain use case, it might be enough as is.
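
For reference, the session-level flow you describe could be modeled
roughly as below. This is a sketch under assumptions, not a proposed
implementation: the Session class, its server_log list and the
KeyNotLoaded error are hypothetical, _toy_cipher() is an insecure
placeholder for real encryption, and key_passphrase() stands in for
the suggested KeyPassphrase protocol message.

```python
import hashlib


class KeyNotLoaded(Exception):
    """Raised when a key name is used before its passphrase was sent."""


def _toy_cipher(passphrase: str, data: bytes) -> bytes:
    """Insecure placeholder: XOR with a SHA-256 counter stream (its own inverse)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(
            passphrase.encode() + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Session:
    """Hypothetical model of one backend session."""

    def __init__(self):
        self._passphrases = {}  # backend-private memory, no SQL-level accessor
        self.server_log = []    # what would end up in the server log

    def key_passphrase(self, key_name: str, passphrase: str) -> str:
        """Models the KeyPassphrase message; the payload is never logged."""
        self._passphrases[key_name] = passphrase
        self.server_log.append(f"KeyPassphrase {key_name} <payload omitted>")
        return "ReadyForQuery"

    def _lookup(self, key_name: str) -> str:
        try:
            return self._passphrases[key_name]
        except KeyError:
            raise KeyNotLoaded(f"no passphrase loaded for key {key_name!r}") from None

    def pg_encrypt(self, data: bytes, key_name: str) -> bytes:
        # Only the key name appears in the statement, so only it can be logged.
        self.server_log.append(f"pg_encrypt(..., {key_name!r})")
        return _toy_cipher(self._lookup(key_name), data)

    def pg_decrypt(self, data: bytes, key_name: str) -> bytes:
        self.server_log.append(f"pg_decrypt(..., {key_name!r})")
        return _toy_cipher(self._lookup(key_name), data)
```

A session that calls pg_encrypt() before key_passphrase() gets
KeyNotLoaded; after the passphrase is loaded, the statements and the
log only ever contain the key name.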

>
> Even with these changes, the security of this whole approach can be
> criticized on the basis that a good amount of information about the
> data can be inferred without decrypting anything. You can tell which
> encrypted values are long and which are short. If someone builds an
> index on the column, you can tell the order of all the encrypted
> values even though you may not know what the actual values are. Those
> could well be meaningful information leaks, but I think such a system
> might still be of use for certain purposes.

Yeah, that's another reason why I personally hesitate to use pgcrypto
as a transparent data encryption feature. What data needs to be
encrypted by transparent data at rest encryption is still under
discussion, but it would be much better than pgcrypto from that
perspective.
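
As a small illustration of the length leak, assume each value is
encrypted on its own with a 16-byte block cipher and PKCS#7-style
padding (an assumption for this example; the helper name is made up).
The ciphertext length then reveals the plaintext length rounded up to
the next block:

```python
def padded_ciphertext_len(plaintext_len: int, block_size: int = 16) -> int:
    """Ciphertext length under PKCS#7-style padding.

    Padding always adds between 1 and block_size bytes, so the result is
    the next multiple of block_size strictly greater than plaintext_len.
    """
    return (plaintext_len // block_size + 1) * block_size


for value in (b"no", b"yes", b"a considerably longer secret value"):
    print(len(value), "->", padded_ciphertext_len(len(value)))
```

Short values fall into the same bucket, but a long value is still
distinguishable from a short one without decrypting anything.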

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
