Re: Transparent column encryption

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2022-11-23 18:45:06
Message-ID: a9381ab5-234a-f796-f43e-ebf84c3922f8@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28.10.22 12:16, Jehan-Guillaume de Rorthais wrote:
> I did a review of the documentation and usability.

I have incorporated some of your feedback into the v11 patch I just posted.

> # Applying patch
>
> The patch applied on top of f13b2088fa2 without trouble. Notice a small
> warning during compilation:
>
> colenccmds.c:134:27: warning: ‘encval’ may be used uninitialized
>
> A simple fix could be:
>
> +++ b/src/backend/commands/colenccmds.c
> @@ -119,2 +119,3
> encval = defGetString(encvalEl);
> + *encval_p = encval;
> }
> @@ -132,4 +133,2
> *alg_p = alg;
> - if (encval_p)
> - *encval_p = encval;
> }

fixed

> # Documentation
>
> * In page "create_column_encryption_key.sgml", both encryption algorithms for
> a CMK are declared as the default one:
>
> + <para>
> + The encryption algorithm that was used to encrypt the key material of
> + this column encryption key. Supported algorithms are:
> + <itemizedlist>
> + <listitem>
> + <para><literal>RSAES_OAEP_SHA_1</literal> (default)</para>
> + </listitem>
> + <listitem>
> + <para><literal>RSAES_OAEP_SHA_256</literal> (default)</para>
> + </listitem>
> + </itemizedlist>
> + </para>
>
> As far as I understand the code, I suppose RSAES_OAEP_SHA_1 should be the
> default.

fixed

> I believe two information should be clearly shown to user somewhere in
> chapter 5.5 instead of being buried deep in documentation:
>
> * «COPY does not support column decryption», currently buried in pg_dump page
> * «When transparent column encryption is enabled, the client encoding must
> match the server encoding», currently buried in the protocol description
> page.
>
> * In the libpq doc of PQexecPrepared2, "paramForceColumnEncryptions" might
> deserve a little more detail about the array format, like «0 means "don't
> enforce" and anything else enforce the encryption is enabled on this
> column». By the way, maybe this array could be an array of boolean?
>
> * In chapter 55.2.5 (protocol-flow) is stated: «when column encryption is
> used, the plaintext is always in text format (not binary format).». Does it
> means parameter "resultFormat" in "PQexecPrepared2" should always be 0? If
> yes, is it worth keeping this argument? Moreover, this format constraint
> should probably be explained in the libpq page as well.

I will keep these suggestions around. Some of these things will
probably change again, so I'll make sure to update the documentation
when I touch it again.

> # Protocol
>
> * In the ColumnEncryptionKey message, it seems the field holding the length
> key material is redundant with the message length itself, as all other
> fields have a known size. The key material length is the message length -
> (4+4+4+2). For comparison, the AuthenticationSASLContinue message has a
> variable data size but rely only on the message length without additional
> field.

I find that weird, though. An explicit length seems better. Things
like AuthenticationSASLContinue only work if they have exactly one
variable-length data item.

> * I wonder if encryption related fields in ParameterDescription and
> RowDescription could be optional somehow? The former might be quite large
> when using a lot of parameters (like, imagine a large and ugly
> "IN($1...$N)"). On the other hand, these messages are not sent in high
> frequency anyway...

They are only used if you turn on the column_encryption protocol option.
Or did you mean make them optional even then?

> # libpq
>
> Would it be possible to have an encryption-ready PQexecParams() equivalent
> of what PQprepare/describe/exec do?

I plan to do that.

> # psql
>
> * You already mark \d in the TODO. But having some psql command to list the
> known CEK/CMK might be useful as well.

right

> * about write queries using psql, would it be possible to use psql
> parameters? Eg.:
>
> => \set c1 myval
> => INSERT INTO mytable VALUES (:'c1') \gencr

No, because those are resolved by psql before libpq sees them.

> # Manual tests
>
> * The lookup error message is shown twice for some reason:
>
> => select * from test_tce;
> no CMK lookup found for realm ""
>
> no CMK lookup found for realm ""
>
> It might worth adding the column name and the CMK/CEK names related to each
> error message? Last, notice the useless empty line between messages.

I'll look into that.

> * When "DROP IF EXISTS" a missing CEK or CMK, the command raise an
> "unrecognized object type":
>
> => DROP COLUMN MASTER KEY IF EXISTS noexists;
> ERROR: unrecognized object type: 10
> => DROP COLUMN ENCRYPTION KEY IF EXISTS noexists;
> ERROR: unrecognized object type: 8

fixed

> * I was wondering what "pg_typeof" should return. It currently returns
> "pg_encrypted_*". If this is supposed to be transparent from the client
> perspective, shouldn't it return "attrealtypid" when the field is encrypted?

Interesting question. Need to think about it. I'm not sure what the
purpose of pg_typeof really is. The only use I can recall is for pgTAP.

> * any reason to not support altering the CMK realm?

This could be added. I have that in my notes.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-11-23 18:47:33 Re: Document parameter count limit
Previous Message Dean Rasheed 2022-11-23 18:43:58 Re: Another multi-row VALUES bug