Re: Transparent column encryption

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transparent column encryption
Date: 2023-01-10 19:47:11
Message-ID: 61E7BB91-45B2-48AF-B52F-F114092A06BE@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 10, 2023, at 9:26 AM, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
> -- Cryptographically connected to the encrypted record
> patient_id BIGINT NOT NULL,
> patient_ssn CHAR(11),
>
> -- The encrypted record
> patient_record TEXT ENCRYPTED WITH (column_encryption_key = cek1,
> column_encryption_salt = (patient_id, patient_ssn)),

As you mention upthread, tying columns together creates problems for statements that only operate on a subset of columns. Allowing schema designers a choice about tying the encrypted column to zero or more other columns allows them to choose which works best for their security needs.

The example above would make a statement like "UPDATE patient_record SET patient_record = $1 \bind '{some json whatever}'" raise an exception at the libpq client level, but maybe that's what schema designers wants it to do. If not, they should omit the column_encryption_salt option in the create table statement; but if so, they should expect to have to specify the other columns as part of the update statement, possibly as part of the where clause, like

UPDATE patient_record
SET patient_record = $1
WHERE patient_id = 12345
AND patient_ssn = '111-11-1111'
\bind '{some json record}'

and have the libpq get the salt column values from the where clause (which may be tricky to implement), or perhaps use some new bind syntax like

UPDATE patient_record
SET patient_record = ($1:$2,$3) -- new, wonky syntax
WHERE patient_id = $2
AND patient_ssn = $3
\bind '{some json record}' 12345 '111-11-1111'

which would be error prone, since the sql statement could specify the ($1:$2,$3) inconsistently with the where clause, or perhaps specify the "new" salt columns even when not changed, like

UPDATE patient_record
SET patient_record = $1, patient_id = 12345, patient_ssn = "111-11-1111"
WHERE patient_id = 12345
AND patient_ssn = "111-11-1111"
\bind '{some json record}'

which looks kind of nuts at first glance, but is grammatically consistent with cases where one or both of the patient_id or patient_ssn are also being changed, like

UPDATE patient_record
SET patient_record = $1, patient_id = 98765, patient_ssn = "999-99-9999"
WHERE patient_id = 12345
AND patient_ssn = "111-11-1111"
\bind '{some json record}'

Or, of course, you can ignore these suggestions or punt them to some future patch that extends the current work, rather than trying to get it all done in the first column encryption commit. But it seems useful to think about what future directions would be, to avoid coding ourselves into a corner, making such future work harder.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2023-01-10 19:47:41 Re: Fixing a couple of buglets in how VACUUM sets visibility map bits
Previous Message Jacob Champion 2023-01-10 19:36:12 Re: RFC: logical publication via inheritance root?