Re: Transparent column encryption

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2023-01-30 22:30:32
Message-ID: CAAWbhmibyvZXu3P7vU8FjtUN-Jx86udDVjkt=fNQLuQgOBniXQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 25, 2023 at 11:00 AM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> > When writing the paragraph on RSA-OAEP I was reminded that we didn't
> > really dig into the asymmetric/symmetric discussion. Assuming that most
> > first-time users will pick the builtin CMK encryption method, do we
> > still want to have an asymmetric scheme implemented first instead of a
> > symmetric keywrap? I'm still concerned about that public key, since it
> > can't really be made public.
>
> I had started coding that, but one problem was that the openssl CLI
> doesn't really provide any means to work with those kinds of keys. The
> "openssl enc" command always wants to mix in a password. Without that,
> there is no way to write a test case, and more crucially no way for
> users to set up these kinds of keys. Unless we write our own tooling
> for this, which, you know, the patch just passed 400k in size.

Arrgh: https://github.com/openssl/openssl/issues/10605

> > The column encryption algorithm is set per-column -- but isn't it
> > tightly coupled to the CEK, since the key length has to match? From a
> > layperson perspective, using the same key to encrypt the same plaintext
> > under two different algorithms (if they happen to have the same key
> > length) seems like it might be cryptographically risky. Is there a
> > reason I should be encouraged to do that?
>
> Not really. I was also initially confused by this setup, but that's how
> other similar systems are set up, so I thought it would be confusing to
> do it differently.

Which systems let you mix and match keys and algorithms this way? I'd
like to take a look at them.

> > With the loss of \gencr it looks like we also lost a potential way to
> > force encryption from within psql. Any plans to add that for v1?
>
> \gencr didn't do that either. We could do it. The libpq API supports
> it. We just need to come up with some syntax for psql.

Do you think people would rather set encryption for all parameters at
once -- something like \encbind -- or have the ability to mix
encrypted and unencrypted parameters?

> > Are there plans to document client-side implementation requirements, to
> > ensure cross-client compatibility? Things like the "PG\x00\x01"
> > associated data are buried at the moment (or else I've missed them in
> > the docs). If you're holding off until the feature is more finalized,
> > that's fine too.
>
> This is documented in the protocol chapter, which I thought was the
> right place. Did you want more documentation, or in a different place?

I just missed it; sorry.

> > Speaking of cross-client compatibility, I'm still disconcerted by the
> > ability to write the value "hello world" into an encrypted integer
> > column. Should clients be required to validate the text format, using
> > the attrealtypid?
>
> Well, we can ask them to, but we can't really require them, in a
> cryptographic sense. I'm not sure what more we can do.

Right -- I just mean that clients need to pay more attention to input
validation now, whereas before they may have delegated correctness to
the server. The problem is documented in the context of deterministic
encryption, but I think it applies to randomized encryption as well.

More concretely: should psql allow you to push arbitrary text into an
encrypted \bind parameter, like it does now?
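
To make that concrete, a client-side check might look roughly like the
sketch below. This is purely illustrative Python, not anything from the
patch: check_int4_text() is a hypothetical helper, and it assumes the
client has already mapped the column's attrealtypid to int4. It
deliberately accepts only plain decimal digits, which is stricter than
the server's own integer input routine.

```python
import re

# Range of a 4-byte signed integer.
INT4_MIN = -2**31
INT4_MAX = 2**31 - 1

# Optional sign followed by ASCII digits only (no Unicode digits).
_INT_TEXT = re.compile(r"[+-]?[0-9]+")


def check_int4_text(value: str) -> str:
    """Return a canonical text form of value, or raise ValueError.

    Hypothetical helper: meant to run on the plaintext before it is
    encrypted, since the server can no longer check it afterwards.
    """
    text = value.strip()
    if not _INT_TEXT.fullmatch(text):
        raise ValueError(f"not a valid integer literal: {value!r}")
    number = int(text)
    if not INT4_MIN <= number <= INT4_MAX:
        raise ValueError(f"out of range for int4: {value!r}")
    return str(number)
```

A client would run this on the text value before encrypting it, and
refuse to send anything that raises.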

> > It occurred to me when looking at the "unspecified" CMK scheme that the
> > CEK doesn't really have to be an encryption key at all. In that case it
> > can function more like a (possibly signed?) cookie for lookup, or even
> > be ignored altogether if you don't want to use a wrapping scheme
> > (similar to JWE's "direct" mode, maybe?). So now you have three ways to
> > look up or determine a column encryption key (CMK realm, CMK name, CEK
> > cookie)... is that a concept worth exploring in v1 and/or the documentation?
>
> I don't completely follow this.

Yeah, I'm not expressing it very well. My feeling is that the
organization system here -- a realm "contains" multiple CMKs, a CMK
encrypts multiple CEKs -- is so general and flexible that it may need
some suggested guardrails for people to use it sanely. I just don't
know what those guardrails should be. I was motivated by the
realization that CEKs don't even need to be keys.
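
A rough way to picture the hierarchy, and the "CEK as cookie" case, is
the sketch below. It is illustrative Python only: the class names, the
resolve_key() helper, the scheme labels, and the key_store lookup are
all hypothetical and not taken from the patch, and the "unwrappers"
callables stand in for whatever real unwrapping a client implements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Cmk:
    realm: str  # a realm "contains" multiple CMKs
    name: str


@dataclass(frozen=True)
class Cek:
    name: str
    cmk: Cmk     # a CMK encrypts multiple CEKs
    scheme: str  # hypothetical label, e.g. "RSA-OAEP" or "unspecified"
    data: bytes  # wrapped key material, or just an opaque cookie


Unwrap = Callable[[bytes], bytes]


def resolve_key(
    cek: Cek,
    unwrappers: Dict[Tuple[str, str], Unwrap],
    key_store: Dict[bytes, bytes],
) -> bytes:
    """Return the column key for cek (assumed client-side logic)."""
    if cek.scheme == "unspecified":
        # The stored bytes are not a wrapped key at all: treat them as
        # a cookie and look the real key up somewhere else.
        return key_store[cek.data]
    # Otherwise pick an unwrap routine by (realm, CMK name).
    return unwrappers[(cek.cmk.realm, cek.cmk.name)](cek.data)
```

Even in this toy form there are three independent knobs (realm, CMK
name, cookie) that decide which key comes back, which is the
flexibility I'm worried about.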

Thanks,
--Jacob
