Re: Transparent column encryption

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2022-10-28 10:16:29
Message-ID: 20221028121629.6467d8f9@karst
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I did a review of the documentation and usability.

# Applying patch

The patch applied on top of f13b2088fa2 without trouble. Notice a small
warning during compilation:

colenccmds.c:134:27: warning: ‘encval’ may be used uninitialized

A simple fix could be:

+++ b/src/backend/commands/colenccmds.c
@@ -119,2 +119,3
encval = defGetString(encvalEl);
+ *encval_p = encval;
}
@@ -132,4 +133,2
*alg_p = alg;
- if (encval_p)
- *encval_p = encval;
}

# Documentation

* In page "create_column_encryption_key.sgml", both encryption algorithms for
a CMK are declared as the default one:

+ <para>
+ The encryption algorithm that was used to encrypt the key material of
+ this column encryption key. Supported algorithms are:
+ <itemizedlist>
+ <listitem>
+ <para><literal>RSAES_OAEP_SHA_1</literal> (default)</para>
+ </listitem>
+ <listitem>
+ <para><literal>RSAES_OAEP_SHA_256</literal> (default)</para>
+ </listitem>
+ </itemizedlist>
+ </para>

As far as I understand the code, I suppose RSAES_OAEP_SHA_1 should be the
default.

I believe two information should be clearly shown to user somewhere in
chapter 5.5 instead of being buried deep in documentation:

* «COPY does not support column decryption», currently buried in pg_dump page
* «When transparent column encryption is enabled, the client encoding must
match the server encoding», currently buried in the protocol description
page.

* In the libpq doc of PQexecPrepared2, "paramForceColumnEncryptions" might
deserve a little more detail about the array format, like «0 means "don't
enforce" and anything else enforce the encryption is enabled on this
column». By the way, maybe this array could be an array of boolean?

* In chapter 55.2.5 (protocol-flow) is stated: «when column encryption is
used, the plaintext is always in text format (not binary format).». Does it
means parameter "resultFormat" in "PQexecPrepared2" should always be 0? If
yes, is it worth keeping this argument? Moreover, this format constraint
should probably be explained in the libpq page as well.

# Protocol

* In the ColumnEncryptionKey message, it seems the field holding the length
key material is redundant with the message length itself, as all other
fields have a known size. The key material length is the message length -
(4+4+4+2). For comparison, the AuthenticationSASLContinue message has a
variable data size but rely only on the message length without additional
field.

* I wonder if encryption related fields in ParameterDescription and
RowDescription could be optional somehow? The former might be quite large
when using a lot of parameters (like, imagine a large and ugly
"IN($1...$N)"). On the other hand, these messages are not sent in high
frequency anyway...

# libpq

Would it be possible to have an encryption-ready PQexecParams() equivalent
of what PQprepare/describe/exec do?

# psql

* You already mark \d in the TODO. But having some psql command to list the
known CEK/CMK might be useful as well.

* about write queries using psql, would it be possible to use psql
parameters? Eg.:

=> \set c1 myval
=> INSERT INTO mytable VALUES (:'c1') \gencr

# Manual tests

* The lookup error message is shown twice for some reason:

=> select * from test_tce;
no CMK lookup found for realm ""

no CMK lookup found for realm ""

It might worth adding the column name and the CMK/CEK names related to each
error message? Last, notice the useless empty line between messages.

* When "DROP IF EXISTS" a missing CEK or CMK, the command raise an
"unrecognized object type":

=> DROP COLUMN MASTER KEY IF EXISTS noexists;
ERROR: unrecognized object type: 10
=> DROP COLUMN ENCRYPTION KEY IF EXISTS noexists;
ERROR: unrecognized object type: 8

* I was wondering what "pg_typeof" should return. It currently returns
"pg_encrypted_*". If this is supposed to be transparent from the client
perspective, shouldn't it return "attrealtypid" when the field is encrypted?

* any reason to not support altering the CMK realm?

This patch is really interesting and would be a nice addition to the core.

Thanks!

Regards,

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-10-28 10:17:10 Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant
Previous Message Etsuro Fujita 2022-10-28 10:12:52 Re: Fast COPY FROM based on batch insert