Re: Transparent column encryption

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2023-01-19 20:48:03
Message-ID: 5003d222-5975-38c1-e471-888e642f23aa@timescale.com
Lists: pgsql-hackers

On 12/31/22 06:17, Peter Eisentraut wrote:
> On 21.12.22 06:46, Peter Eisentraut wrote:
>> And another update.  The main changes are that I added an 'unspecified'
>> CMK algorithm, which indicates that the external KMS knows what it is
>> but the database system doesn't.  This was discussed a while ago.  I
>> also changed some details about how the "cmklookup" works in libpq. Also
>> added more code comments and documentation and rearranged some code.

Trying to delay a review until I had "completed it" has only led to me
not reviewing, so here's a partial one. Let me know what pieces of the
implementation and/or architecture you're hoping to get more feedback on.

I like the existing "caveats" documentation, and I've attached a sample
patch with some more caveats documented, based on some of the upthread
conversation:

- text format makes fixed-length columns leak length information too
- you only get partial protection against the Evil DBA
- RSA-OAEP public key safety

(Feel free to use/remix/discard as desired.)

When writing the paragraph on RSA-OAEP I was reminded that we didn't
really dig into the asymmetric/symmetric discussion. Assuming that most
first-time users will pick the builtin CMK encryption method, do we
still want to have an asymmetric scheme implemented first instead of a
symmetric keywrap? I'm still concerned about that public key, since it
can't really be made public. (And now that "unspecified" is available, I
think an asymmetric CMK could be easily created by users that have a
niche use case, and then we wouldn't have to commit to supporting it
forever.)

For the padding caveat:

> + There is no concern if all values are of the same length (e.g., credit
> + card numbers).

I nodded along to this statement last year, and then this year I learned
that CCNs aren't fixed-length. So with a 16-byte block, you're probably
going to be able to figure out who has an American Express card.
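
To make the leak concrete, here is a minimal sketch of the arithmetic. It is not taken from the patch: it assumes the text format stores one byte per digit and that the cipher pads PKCS#7-style (padding is always added, so an exact multiple of the block size grows by one full block). The function name `padded_len` is made up for illustration, and any IV or authentication tag is left out because it adds the same constant to every value.

```python
BLOCK = 16  # AES block size in bytes


def padded_len(plaintext_len: int, block: int = BLOCK) -> int:
    """Length of the padded plaintext, assuming PKCS#7-style padding,
    which always appends between 1 and `block` bytes."""
    return (plaintext_len // block + 1) * block


# Assumed card-number lengths in text format, one byte per digit.
for name, digits in [("15-digit number (e.g. American Express)", 15),
                     ("16-digit number", 16)]:
    print(f"{name}: {digits} bytes -> {padded_len(digits)} bytes after padding")
```

Under those assumptions a 15-digit number fits in one block and a 16-digit number needs two, so the ciphertext length alone separates the two groups.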

The column encryption algorithm is set per-column -- but isn't it
tightly coupled to the CEK, since the key length has to match? From a
layperson's perspective, using the same key to encrypt the same plaintext
under two different algorithms (if they happen to have the same key
length) seems like it might be cryptographically risky. Is there a
reason I should be encouraged to do that?

With the loss of \gencr it looks like we also lost a potential way to
force encryption from within psql. Any plans to add that for v1?

While testing, I forgot how the new option worked and connected with
`column_encryption=on` -- and then I accidentally sent unencrypted data
to the server, since `on` means "not enabled". :( The server errors out
after the damage is done, of course, but would it be okay to strictly
validate that option's values?
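
A rough sketch of what strict validation could look like, written in Python for brevity rather than as libpq code. The set of accepted spellings below ("0" and "1") is an assumption made for illustration, and `parse_column_encryption` is a hypothetical helper, not a function from the patch. The point is only that an unrecognized value such as `on` is rejected before any data is sent, instead of silently meaning "disabled".

```python
# Hypothetical accepted spellings; the real option's values are defined
# by the patch, not by this sketch.
ACCEPTED = {"0": False, "1": True}


def parse_column_encryption(value: str) -> bool:
    """Return the parsed setting, or fail loudly on anything unrecognized."""
    try:
        return ACCEPTED[value]
    except KeyError:
        raise ValueError(
            f'invalid column_encryption value: "{value}"'
        ) from None
```

With this shape, a typo in the connection string fails at connection time, which is the fail-closed behavior asked for above.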

Are there plans to document client-side implementation requirements, to
ensure cross-client compatibility? Things like the "PG\x00\x01"
associated data are buried at the moment (or else I've missed them in
the docs). If you're holding off until the feature is more finalized,
that's fine too.

Speaking of cross-client compatibility, I'm still disconcerted by the
ability to write the value "hello world" into an encrypted integer
column. Should clients be required to validate the text format, using
the attrealtypid?
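
As an illustration of the kind of check meant here, a minimal sketch in Python. It assumes the client learns the column's real type OID (what the message calls attrealtypid) and keeps a table of per-type validators. Only `integer` is covered, using its well-known pg_type OID of 23. The accepted syntax is deliberately simplified (optional sign plus decimal digits) and does not claim to match the server's input function exactly. All function names are hypothetical.

```python
import re

INT4OID = 23  # pg_type OID of "integer" (int4)
_INT_RE = re.compile(r"[+-]?[0-9]+")


def check_int4(text: str) -> bool:
    """Simplified int4 text check: optional sign, decimal digits, 32-bit range."""
    s = text.strip()
    return bool(_INT_RE.fullmatch(s)) and -2**31 <= int(s) <= 2**31 - 1


# Hypothetical registry: real type OID -> text-format validator.
VALIDATORS = {INT4OID: check_int4}


def validate_before_encrypt(real_type_oid: int, text: str) -> None:
    """Raise if the plaintext is not valid text for the column's real type.
    Types without a registered validator pass through unchecked."""
    check = VALIDATORS.get(real_type_oid)
    if check is not None and not check(text):
        raise ValueError(
            f"invalid input for type OID {real_type_oid}: {text!r}"
        )
```

Here `validate_before_encrypt(INT4OID, "hello world")` raises before anything is encrypted, while `validate_before_encrypt(INT4OID, "42")` passes. Whether unknown types should pass or fail is a design choice this sketch leaves open.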

It occurred to me when looking at the "unspecified" CMK scheme that the
CEK doesn't really have to be an encryption key at all. In that case it
can function more like a (possibly signed?) cookie for lookup, or even
be ignored altogether if you don't want to use a wrapping scheme
(similar to JWE's "direct" mode, maybe?). So now you have three ways to
look up or determine a column encryption key (CMK realm, CMK name, CEK
cookie)... is that a concept worth exploring in v1 and/or the documentation?

Thanks,
--Jacob

Attachment: caveats.diff.txt (text/plain, 2.0 KB)
