Re: Transparent column encryption

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Jacob Champion <jchampion(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2023-01-25 19:00:26
Message-ID: 00b0c4f3-0d9f-dcfd-2ba0-eee5109b4963@enterprisedb.com
Lists: pgsql-hackers

On 19.01.23 21:48, Jacob Champion wrote:
> I like the existing "caveats" documentation, and I've attached a sample
> patch with some more caveats documented, based on some of the upthread
> conversation:
>
> - text format makes fixed-length columns leak length information too
> - you only get partial protection against the Evil DBA
> - RSA-OAEP public key safety
>
> (Feel free to use/remix/discard as desired.)

I have added those in the v15 patch I just posted.

> When writing the paragraph on RSA-OAEP I was reminded that we didn't
> really dig into the asymmetric/symmetric discussion. Assuming that most
> first-time users will pick the builtin CMK encryption method, do we
> still want to have an asymmetric scheme implemented first instead of a
> symmetric keywrap? I'm still concerned about that public key, since it
> can't really be made public.

I had started coding that, but one problem was that the openssl CLI
doesn't really provide any means to work with those kinds of keys. The
"openssl enc" command always wants to mix in a password. Without that,
there is no way to write a test case, and more crucially no way for
users to set up these kinds of keys. Unless we write our own tooling
for this, which, you know, the patch just passed 400k in size.

> For the padding caveat:
>
>> + There is no concern if all values are of the same length (e.g., credit
>> + card numbers).
>
> I nodded along to this statement last year, and then this year I learned
> that CCNs aren't fixed-length. So with a 16-byte block, you're probably
> going to be able to figure out who has an American Express card.

Heh. I have removed that parenthetical remark.
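
As an illustration of the length leak discussed above, here is a minimal sketch. It assumes a 16-byte block cipher with PKCS#7-style padding; that padding scheme is an assumption for illustration, and any IV or MAC the actual ciphertext format carries is ignored. The name padded_len is hypothetical.

```python
BLOCK_SIZE = 16  # bytes; the AES block size


def padded_len(plaintext_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Length after PKCS#7-style padding, which always adds 1..block_size bytes."""
    return (plaintext_len // block_size + 1) * block_size


# Card numbers sent in text format: one byte per digit.
for label, digits in [("15-digit number", 15), ("16-digit number", 16)]:
    print(label, "->", padded_len(digits), "bytes of padded plaintext")
```

Under these assumptions a 15-digit number fits in one block while a 16-digit number spills into a second one, so the ciphertext length alone separates the two groups.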

> The column encryption algorithm is set per-column -- but isn't it
> tightly coupled to the CEK, since the key length has to match? From a
> layperson perspective, using the same key to encrypt the same plaintext
> under two different algorithms (if they happen to have the same key
> length) seems like it might be cryptographically risky. Is there a
> reason I should be encouraged to do that?

Not really. I was also initially confused by this setup, but that's how
other similar systems are set up, so I thought it would be confusing to
do it differently.

> With the loss of \gencr it looks like we also lost a potential way to
> force encryption from within psql. Any plans to add that for v1?

\gencr didn't do that either. We could do it. The libpq API supports
it. We just need to come up with some syntax for psql.

> While testing, I forgot how the new option worked and connected with
> `column_encryption=on` -- and then I accidentally sent unencrypted data
> to the server, since `on` means "not enabled". :( The server errors out
> after the damage is done, of course, but would it be okay to strictly
> validate that option's values?

Fixed in v15.
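
For illustration only, strict validation of such an option could look like the sketch below. It is not the libpq code; the accepted values "1" and "0" and the function name parse_column_encryption are assumptions made for the example.

```python
# Hypothetical accepted values; the real option's value set may differ.
_ACCEPTED = {"1": True, "0": False}


def parse_column_encryption(value: str) -> bool:
    """Return the parsed setting, rejecting anything not explicitly known."""
    try:
        return _ACCEPTED[value]
    except KeyError:
        # Fail before connecting rather than silently treating the value as "off".
        raise ValueError(f'invalid column_encryption value: "{value}"') from None
```

With this shape, a value such as `on` raises an error up front instead of being interpreted as "not enabled", so no unencrypted data is sent by accident.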

> Are there plans to document client-side implementation requirements, to
> ensure cross-client compatibility? Things like the "PG\x00\x01"
> associated data are buried at the moment (or else I've missed them in
> the docs). If you're holding off until the feature is more finalized,
> that's fine too.

This is documented in the protocol chapter, which I thought was the
right place. Did you want more documentation, or in a different place?

> Speaking of cross-client compatibility, I'm still disconcerted by the
> ability to write the value "hello world" into an encrypted integer
> column. Should clients be required to validate the text format, using
> the attrealtypid?

Well, we can ask them to, but we can't really require them, in a
cryptographic sense. I'm not sure what more we can do.
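
As a rough sketch of what such optional client-side checking might look like for an integer column: the code below validates the text form before it would be encrypted. The function name and the deliberately conservative rule (optional sign, ASCII decimal digits, int4 range) are assumptions for illustration; a real client would dispatch on the type reported as attrealtypid.

```python
import re

# Conservative pattern: optional sign followed by ASCII decimal digits only.
_INT_RE = re.compile(r"[+-]?[0-9]+")

INT4_MIN = -2**31
INT4_MAX = 2**31 - 1


def valid_int4_text(text: str) -> bool:
    """Check that text looks like a 32-bit integer before encrypting it."""
    stripped = text.strip(" \t\n\r")
    if not _INT_RE.fullmatch(stripped):
        return False
    return INT4_MIN <= int(stripped) <= INT4_MAX
```

A check like this only helps well-behaved clients; nothing prevents another client holding the key from encrypting arbitrary bytes.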

> It occurred to me when looking at the "unspecified" CMK scheme that the
> CEK doesn't really have to be an encryption key at all. In that case it
> can function more like a (possibly signed?) cookie for lookup, or even
> be ignored altogether if you don't want to use a wrapping scheme
> (similar to JWE's "direct" mode, maybe?). So now you have three ways to
> look up or determine a column encryption key (CMK realm, CMK name, CEK
> cookie)... is that a concept worth exploring in v1 and/or the documentation?

I don't completely follow this.
