Re: Transparent column encryption

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2021-12-06 18:28:28
Message-ID: CA+Tgmob_xikGmJg9gYjc7fAX3a9RzAZhnziKqoAQgdfNjCADjw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 3, 2021 at 4:32 PM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> But it's missing the remaining 90% of the work,
> including additional DDL support, error handling, robust memory
> management, protocol versioning, forward and backward compatibility,
> pg_dump support, psql \d support, refinement of the cryptography, and
> so on. But I think obvious solutions exist to all of those things, so
> it isn't that interesting to focus on them for now.

Right, we wouldn't want to get bogged down at this stage in little
details like, uh, everything.

> Some protocol extensions are required. These should be guarded by
> some _pq_... setting, but this is not done in this patch yet. As
> mentioned above, extra messages are added for sending the CMKs and
> CEKs. In the RowDescription message, I have commandeered the format
> field to add a bit that indicates that the field is encrypted. This
> could be made a separate field, and there should probably be
> additional fields to indicate the algorithm and CEK name, but this was
> easiest for now. The ParameterDescription message is extended to
> contain format fields for each parameter, for the same purpose.
> Again, this could be done differently.

I think this is reasonable. I would choose to use an additional bit in
the format field as opposed to a separate field. It is worth
considering whether it makes more sense to extend the existing
ParameterDescription message conditionally on some protocol-level
option, or whether we should instead, say, add ParameterDescription2
or the moral equivalent. As I see it, the latter feels conceptually
simpler, but on the other hand, our wire protocol supposes that we
will never run out of 1-byte codes for messages, so perhaps some
prudence is needed.
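
As a rough illustration of the "extra bit in the format field" option,
here is a minimal Python sketch. It assumes the existing 16-bit format
code (0 = text, 1 = binary) and a made-up flag value, ENCRYPTED_FLAG =
0x10; the bit actually used by the patch is not stated in this message,
and the function names are hypothetical.

```python
import struct

FORMAT_TEXT = 0
FORMAT_BINARY = 1
# Hypothetical flag bit; the real patch may use a different value.
ENCRYPTED_FLAG = 0x10


def encode_format(base_format, encrypted):
    """Pack a format code as a network-order Int16, setting the flag bit
    when the field is encrypted."""
    value = base_format | (ENCRYPTED_FLAG if encrypted else 0)
    return struct.pack("!h", value)


def decode_format(raw):
    """Unpack an Int16 format code into (base_format, encrypted)."""
    (value,) = struct.unpack("!h", raw)
    return value & ~ENCRYPTED_FLAG, bool(value & ENCRYPTED_FLAG)
```

A client that does not know about the flag would presumably treat
0x0010 or 0x0011 as an unknown format code, which is one reason to
guard this behind a protocol-level option.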

> Speaking of parameter descriptions, the trickiest part of this whole
> thing appears to be how to get transparently encrypted data into the
> database (as opposed to reading it out). It is required to use
> protocol-level prepared statements (i.e., extended query) for this.

Why? If the client knows the CEK, can't the client choose to send
unprepared insert or update statements with pre-encrypted blobs? That
might be a bad idea from a security perspective because the encrypted
blob might then get logged, but we sometimes log parameters, too.
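
To illustrate the question, a minimal Python sketch of the unprepared
path follows. It assumes the client has already produced the ciphertext
with the CEK by some means not shown, and that the server would accept
it as a bytea hex literal; the helper functions and any table or column
names passed to them are hypothetical.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_insert(table, column, ciphertext):
    """Build a simple-query INSERT that carries pre-encrypted bytes
    inline as a bytea hex literal (assumes standard_conforming_strings
    is on, so the backslash is taken literally)."""
    literal = "'\\x" + ciphertext.hex() + "'::bytea"
    return (
        f"INSERT INTO {quote_ident(table)} ({quote_ident(column)}) "
        f"VALUES ({literal})"
    )
```

The ciphertext appears verbatim in the statement text, which is why it
could end up in the server log if statement logging is enabled.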

--
Robert Haas
EDB: http://www.enterprisedb.com
