Re: Proposed patch for key management

From: Alastair Turner <minion(at)decodable(dot)me>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: Proposed patch for key management
Date: 2021-01-02 12:47:19
Message-ID: CAC0GmywkOGhO3XaoEJhHGvXXq5NaxY+C0teh50vDyZ=+gUhOUg@mail.gmail.com
Lists: pgsql-hackers

Hi Fabien

On Sat, 2 Jan 2021 at 09:50, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
...
> ISTM that pg at the core level should (only) directly provide:
>
> (1) a per-file encryption scheme, with loadable (hook-replaceable
> functions??) to manage pages, maybe:
>
> encrypt(page_id, *key, *clear_page, *encrypted_page);
> decrypt(page_id, *key, *encrypted_page, *clear_page);
>
> What encrypt/decrypt does is beyond pg core stuff. Ok, a reasonable
> implementation should be provided, obviously, possibly in core. There may
> be some block-size issues if not full pages are encrypted, so maybe the
> interface needs to be a little more subtle.
>

There are a lot of specifics of the encryption implementation which
need to be addressed in future patches. This patch focuses on making
keys available to the encryption processes at run-time, so ...

>
> (2) offer a key management scheme interface, to manage *per-file* keys,
> possibly loadable (hook replaceable?). If all files have the same key,
> which is stored in a directory and encoded with a KEK, this is just one
> (debatable) implementation choice. For that, ISTM that what is needed at
> this level is:
>
> get_key(file_id (relative name? oid? 8 or 16 bytes something?));
>

Per-cluster keys for permanent data and WAL allow a useful level of
protection, even if it could be improved upon. They are also going to
be quicker and simpler to implement, so any API should allow for them.
If there's an arbitrary number of DEKs, using a scope label for
accessing them sounds right, so "WAL", "local_data",
"local_data/tablespaceoid" or "local_data/dboid/tableoid".

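As a rough sketch of how scope labels might resolve to keys: the
struct and function names below are hypothetical (nothing here is from
the patch), and the fall-back from a narrow label to the nearest
broader one is my assumption about how a per-cluster setup and a
per-table setup could share one lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: one unwrapped DEK and the scope it covers. */
typedef struct ScopedKey
{
    const char *scope;          /* e.g. "WAL" or "local_data/16384" */
    const unsigned char *dek;   /* key material */
    size_t      dek_len;
} ScopedKey;

/*
 * Return the key whose scope is the longest match for "label": either
 * equal to it, or a prefix of it ending at a '/' boundary.  Returns
 * NULL when no supplied key covers the label.
 */
const ScopedKey *
find_key_for_scope(const ScopedKey *keys, size_t nkeys, const char *label)
{
    const ScopedKey *best = NULL;
    size_t      best_len = 0;

    assert(label != NULL);

    for (size_t i = 0; i < nkeys; i++)
    {
        size_t      len = strlen(keys[i].scope);

        if (strncmp(keys[i].scope, label, len) != 0)
            continue;
        /* "local_data" must not match "local_database" */
        if (label[len] != '\0' && label[len] != '/')
            continue;
        if (best == NULL || len > best_len)
        {
            best = &keys[i];
            best_len = len;
        }
    }
    return best;
}
```

With only "WAL" and "local_data" supplied, a request for
"local_data/dboid/tableoid" resolves to the "local_data" key, which is
the per-cluster case; adding narrower keys later does not change the
call.
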
>
...
> (3) ISTM that the key management interface should be external, or at least
> it should be possible to make it external easily. I do not think that
> there is a significant performance issue because keys are needed once, and
> once loaded they are there. A simple way to do that is a separate process
> with a basic protocol on stdin/stdout to implement "get_key", which is
> basically already half implemented in the patch for retrieving the KEK.
>

If keys can have arbitrary scope, then the pg backend won't know what
to ask for. So the API becomes even simpler: no specific request on
stdin, and all the relevant keys returned on stdout. I generally like
this approach as well, and it will be the only option for some
integrations. On the other hand, there is an advantage to having the
key retrieval piece of key management in-process - the keys are not
being passed around in plaintext.
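
To make the "everything on stdout" variant concrete, here is a minimal
sketch of parsing one line of such output. The wire format (one
"<scope> <hex-key>" pair per line) and the function names are purely
my assumptions for illustration; the patch does not define them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if "c" is not a hex digit. */
static int
hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Parse one "<scope> <hex-key>" line (hypothetical format).  Copies the
 * scope label into "scope" and the decoded bytes into "key".  Returns
 * the key length in bytes, or -1 if the line is malformed or does not
 * fit the supplied buffers.
 */
int
parse_key_line(const char *line, char *scope, size_t scope_size,
               unsigned char *key, size_t key_size)
{
    const char *sep;
    const char *p;
    size_t      scope_len;
    size_t      nbytes = 0;

    assert(line != NULL && scope != NULL && key != NULL);

    sep = strchr(line, ' ');
    if (sep == NULL || sep == line)
        return -1;              /* no separator, or empty scope */
    scope_len = (size_t) (sep - line);
    if (scope_len >= scope_size)
        return -1;
    memcpy(scope, line, scope_len);
    scope[scope_len] = '\0';

    /* Decode two hex digits per byte until end of line. */
    for (p = sep + 1; *p != '\0' && *p != '\n'; p += 2)
    {
        int         hi = hex_value((unsigned char) p[0]);
        int         lo = (hi < 0) ? -1 : hex_value((unsigned char) p[1]);

        if (hi < 0 || lo < 0 || nbytes >= key_size)
            return -1;
        key[nbytes++] = (unsigned char) ((hi << 4) | lo);
    }
    return nbytes > 0 ? (int) nbytes : -1;
}
```

Each line could come from fgets() on a pipe to the external command.
Note that the key bytes are readable in that pipe and in these
buffers, which is the exposure mentioned above.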

There is also a further validation task - probably beyond the scope of
the key management patch and into the encryption patch[es] territory -
checking that the keys supplied are the same keys in use for the data
currently on disk. It feels to me like this should be done at startup,
rather than as each file is accessed, although that could make startup
quite slow if there are a lot of keys with narrow scope.

Regards
Alastair
