Re: Proposed patch for key managment

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Alastair Turner <minion(at)decodable(dot)me>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: Proposed patch for key managment
Date: 2020-12-28 22:15:44
Message-ID: alpine.DEB.2.22.394.2012281736570.2094581@pseudo
Lists: pgsql-hackers


> I want to repeat here what I said in another thread:
>
>> I think ultimately we will need three commands to control the keys.
>> First, there is the cluster_key_command, which we have now. Second, I
>> think we will need an optional command which returns random bytes ---
>> this would allow users to get random bytes from a different source than
>> that used by the server code.
>>
>> Third, we will probably need a command that returns the data encryption
>> keys directly, either heap/index or WAL keys, probably based on key
>> number --- you pass the key number you want, and the command returns the
>> data key. There would not be a cluster key in this case, but the
>> command could still prompt the user for perhaps a password to the KMS
>> server. It could not be used if any of the previous two commands are
>> used. I assume an HMAC would still be stored in the pg_cryptokeys
>> directory to check that the right key has been returned.
>>
>> I thought we should implement the first command, because it will
>> probably be the most common and easiest to use, and then see what people
>> want added.
>
> There is also a fourth option where the command returns multiple keys,
> one per line of hex digits. That could be written in shell script, but
> it would be fragile and complex. It could be written in Perl, but that
> would add a new language requirement for this feature. It could be
> written in C, but that would limit its flexibility because changes
> would require a recompile, and you would probably need that C file to
> call external scripts to tailor input like we do now from the server.
>
> You could actually write a full implementation of what we do on the server
> side in client code, but pg_alterckey would not work, since it would not
> know the data format, so we would need another cluster key alter for that.
>
> It could be written as a C extension, but that would also have C's
> limitations. In summary, having the server do most of the complex work
> for the default case seems best, and eventually allowing the ability for
> the client to do everything seems ideal. I think we need more input
> before we go beyond what we do now.

As I said in the commit thread, I disagree with this approach because it
pushes toward no design, a partial design, or possibly a bad one.

I think that an API should be carefully thought out, without assumptions
about the underlying cryptography (algorithms, key lengths, modes, how keys
are derived and stored, and so on). Its usefulness should be demonstrated
by actually using it for one implementation, namely what is currently
being proposed in the patch, with possibly others thrown in for free.

The implementations should not have to be in any particular language:
Shell, Perl, Python, C should be possible.

After giving it more thought during the day, I think that only one
command and a basic protocol are needed. Maybe something as simple as

/path/to/command --options arguments…

This would come with a basic (text? binary?) protocol on stdin/stdout (?)
for the different functions. What the command actually does (connect to a
remote server, ask for a master password, open some other database,
whatever) should be irrelevant to pg, which would just get and pass a bunch
of bytes to functions, which could use them for keys, secrets, whatever,
and be easily replaceable.

The API should NOT make assumptions about the cryptographic design, what
depends on what, or where things are stored. ISTM that Pg should only care
about naming keys, holding them once created or retrieved (but not creating
them), and basically interacting with the key manager and passing the stuff
to encryption/decryption functions seen as black boxes.
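
As a rough illustration of that separation, the sketch below shows a
caller that only names keys, holds the bytes it is handed, and passes them
to an opaque function. All names here (KeyManager, KeyRing, StaticManager,
toy_xor) are hypothetical, and toy_xor is a placeholder that is NOT real
encryption; it only shows the plumbing.

```python
from typing import Callable, Dict, Protocol


class KeyManager(Protocol):
    """Whatever sits behind the external command (hypothetical interface)."""

    def fetch(self, name: str) -> bytes: ...


# An opaque transform: (key bytes, data) -> data.
Transform = Callable[[bytes, bytes], bytes]


class KeyRing:
    """Names keys and holds them once retrieved; never creates or derives them."""

    def __init__(self, manager: KeyManager) -> None:
        self._manager = manager
        self._held: Dict[str, bytes] = {}

    def apply(self, name: str, transform: Transform, data: bytes) -> bytes:
        if name not in self._held:
            # The bytes are opaque here: no length or algorithm is assumed.
            self._held[name] = self._manager.fetch(name)
        return transform(self._held[name], data)


class StaticManager:
    """Stand-in manager for the sketch; a real one would run the command."""

    def __init__(self, keys: Dict[str, bytes]) -> None:
        self._keys = dict(keys)

    def fetch(self, name: str) -> bytes:
        return self._keys[name]


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Placeholder transform, NOT real encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
```

Swapping StaticManager for another manager, or toy_xor for a real cipher,
would not require touching KeyRing, which is the property argued for above.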

I may have suggested something along these lines at the beginning of the
key management thread. Not going this way implies making some assumptions
which may or may not suit other use cases, which makes the design specific
rather than generic. I do not think pg should do that.

--
Fabien.
