Re: Internal key management system

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, cary huang <hcary328(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: Internal key management system
Date: 2020-07-31 07:06:38
Message-ID: CA+fd4k6RJwNvZTro3q2f5HSDd8HgyUc4CuY9U3e6Ran4C6TO4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 23 Jun 2020 at 22:46, Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Fri, 19 Jun 2020 at 15:44, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
> >
> >
> > Hello Masahiko-san,
> >
> > > What I referred to "only one key" is KEK.
> >
> > Ok, sorry, I misunderstood.
> >
> > >>> Yeah, it depends on KMS, meaning we need different extensions for
> > >>> different KMS. A KMS might support an interface that creates key if not
> > >>> exist during GET but some KMS might support CREATE and GET separately.
> > >>
> > >> I disagree that it is necessary, but this is debatable. The KMS-side
> > >> interface code could take care of that, eg:
> > >>
> > >> if command is "get X"
> > >> if (X does not exist in KMS)
> > >> create a new key stored in KMS, return it;
> > >> else
> > >> return KMS-stored key;
> > >> ...
> > >>
> > >> So you can still have a "GET" only interface which adapts to the "final"
> > >> KMS. Basically, the glue code which implements the interface for the KMS
> > >> can include some logic to adapt to the KMS point of view.
> > >
> > > Is the above code is for the extension side, right?
> >
> > Such a code could be in the command with which pg communicates (eg through
> > its stdin/stdout, or whatever) to get keys.
> >
> > pg talks to the command, the command can do anything, such as storing keys
> > or communicating with an external service to retrieve them, anything
> > really, that is the point.
> >
> > I'm advocating defining the pg/command protocol, something along "GET xxx"
> > as you wrote, and possibly provide a possible/reasonable command
> > implementation, which would be part of the code you put in your patch,
> > only it would be in the command instead of postgres.
> >
> > > For example, if users want to use a cloud KMS, say AWS KMS, to store
> > > DEKs and KEK they need an extension that is loaded to postgres and can
> > > communicate with AWS KMS. I imagine that such extension needs to be
> > > written in C,
> >
> > Why? I could write it in bash, probably. Ok, maybe not so good for suid,
> > but in principle it could be anything. I'd probably write it in C, though.
> >
> > > the communication between the extension uses AWS KMS API, and the
> > > communication between postgres core and the extension uses text
> > > protocol.
> >
> > I'm not sure of the word "extension" above. For me the postgres side could
> > be an extension as in "CREATE EXTENSION". The command itself could be
> > provided in the extension code, but would not be in the "CREATE
> > EXTENSION", it would be something run independently.
>
> Oh, I imagined extensions that can be installed by CREATE EXTENSION or
> specifying it to shared_preload_libraries.
>
> If the command runs in the background to talk with postgres there are
> some problems that we need to deal with. For instance, what if the
> process downs? Does the postmaster re-execute it? How does it work in
> single-user mode? etc. It seems to me it will bring another
> complexity.
>
> >
> > > When postgres core needs a DEK identified by KEY-A, it asks
> > > for the extension to get the DEK by passing something like “GET KEY-A”
> > > message, and then the extension asks the existence of that key to AWK
> > > KMS, creates if not exist and returns it to the postgres core. Is my
> > > understanding right?
> >
> > Yes. The command in the use-case you outline would just be an
> > intermediary, but for another use-case it would do the storing. The point
> > of aiming at extensibility if that from pg point of view the external
> > commands provide keys, but what these commands really do to do this can be
> > anything.
> >
> > > When we have TDE feature in the future, we would also need to change
> > > frontend tools such as pg_waldump and pg_rewind that read database
> > > files so that they can read encrypted files. It means that these
> > > frond-end tools also somehow need to communicate with the external
> > > place to get DEKs in order to decrypt encrypted database files. In
> > > your idea, what do you think about how we can support it?
> >
> > Hmmm. My idea was that the natural interface would be to get things
> > through postgres. For a debug tool such as pg_waldump, probably it needs
> > to be adapted if it needs to decrypt data to operate.
> >
> > Now I'm not sure I understood, because of the DEK are managed by postgres
> > in your patch, waldump and other external commands would have no access to
> > the decrypted data anyway, so the issue would be the same?
>
> With the current patch, we will be able to add
> --cluster-passphase-command command-line option to front-end tools
> that want to read encrypted database files. The front-end tools
> execute the specified command to get KEK and unwrap DEKs. The
> functions such as running passphrase command, verifying the passphrase
> is correct, and getting wrapped DEKs from the database cluster are
> implemented in src/common so both can use these functions.
>
> >
> > With file-level encryption, obviously all commands which have to read and
> > understand the files have to be adapted if they are to still work, which
> > is another argument to have some interface rather than integrated
> > server-side stuff, because these external commands would need to be able
> > to get keys and use them as well.
> >
> > Or I misunderstood something.
> >
> > >> I'd like an extensible design to have anything in core. As I said in an
> > >> other mail, if you want to handle a somehow restricted use case, just
> > >> provide an external extension and do nothing in core, please. Put in core
> > >> something that people with a slightly different use case or auditor can
> > >> build on as well. The current patch makes a dozen hard-coded decisions
> > >> which it should not, IMHO.
> > >
> > > It might have confused you that I included key manager and encryption
> > > SQL functions to the patches but this key manager has been designed
> > > dedicated to only TDE.
> >
> > Hmmm. This is NOT AT ALL what the patch does. The documentation in your
> > patch talks about "column level encryption", which is an application
> > thing. Now you seem to say that it does not matter and can be removed
> > because the use case is elsewhere.
> >
> > > It might be better to remove both SQL interface
> > > and SQL key we discussed from the patch set as they are actually not
> > > necessary for TDE purposes.
> >
> > The documentation part of the patch, at no point, talks about TDE
> > (transparent data encryption), which is a file-level encryption as far as
> > I understand it, i.e. whole files are encrypted.
>
> I should have described the positioning of these patches.. The current
> patch is divided into 7 patches but there are two patches for
> different purposes.
>
> The first patch is to add an internal key manager. This is
> PostgreSQL’s internal component to manage cryptographic keys mainly
> for TDE. Originally I proposed both key manager and TDE[1] but these
> are actually independent each other and we can discuss them
> separately. Therefore, I started a new thread to discuss only the key
> manager. According to the discussion so far, since TDE doesn't need to
> dynamically register DEKs the key manager doesn't have an interface to
> register DEK for now. We cannot do anything with only the first patch
> but the plan is to implement TDE on top of the key manager. So we can
> implement the key manager patch and TDE patch separately but these
> actually depend on each other.
>
> The second patch is to make the key manager use even without TDE. The
> idea of this patch is to help the incremental development of TDE.
> There was a discussion that since the development of both the key
> manager and TDE will take a long time, it’s better to make the key
> manager work alone by providing SQL interfaces to it. This patch adds
> new SQL functions: pg_wrap() and pg_unwrap(), to wrap and unwrap the
> arbitrary user secret by the encryption key called SQL internal key
> which is managed and stored by the key manager. What this patch does
> is to register the SQL internal key to the key manager and add SQL
> functions that wrap and unwrap the given string in the same way of key
> wrapping used by the key manager. These functions renamed to
> pg_encrypt() and pg_decrypt().
>
> Given that the purpose of the key manager is to help TDE, discussing
> the SQL interface part (i.g., the second patch) deviates from the
> original purpose. I think we should discuss the design and
> implementation of the key manager first and then other additional
> interfaces. So I’ve attached a new version patch and removed the
> second patch part so that we can focus on only the key manager part.
>

Since the previous patch sets conflicts with the current HEAD, I've
attached the rebased patch set.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v13-0001-Add-encryption-functions-for-both-frontend-and-b.patch application/octet-stream 12.8 KB
v13-0003-Documentation-update.patch application/octet-stream 22.5 KB
v13-0002-Add-key-management-module.patch application/octet-stream 58.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2020-07-31 07:44:27 Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
Previous Message Kasahara Tatsuhito 2020-07-31 06:26:44 Re: autovac issue with large number of tables