Re: Fast Search on Encrypted Feild

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: "Naoko Reeves" <naoko(at)lawlogix(dot)com>
Cc: "Merlin Moncure" <mmoncure(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fast Search on Encrypted Feild
Date: 2009-11-14 23:40:34
Message-ID: 20091114184034.acadeed8.wmoran@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Naoko Reeves" <naoko(at)lawlogix(dot)com> wrote:
>
> Merlin,
> Thank you for your quick response. I see... our security requirements are:
> We are encrypting PII information within our DB and because of the sensitive nature of our data, we must balancing both performance and security to meet our client requirements.
> Our clients are mainly lawyers and handles clients case (government, healthcare, education).
> If you could provide me any advice that would be great otherwise I understand that I have to go without wildcard search.

For the most part, you have either performance or security, but not both.
As others have pointed out, anything you do to speed up searches basically
results in storing an unencrypted version of the data in an index.

Storing the data redundantly is often worthwhile. For example, in the US,
it's generally considered proper to store the SSN encrypted, because it's
a target for identity theft, but the last 4 digits of the SSN aren't
considered enough information to steal someone's identity, so they are
usually stored unencrypted, and thus provide fast searching. Since this
convention is so common in the US, most people search on the last 4 digits
anyway, so it works out well.

Another option that might be worthwhile is deidentifying the data, which is
a practice often done for HIPPA-protected information. I was hoping to
point you toward a good reference on how to do this on the web, but Google
is failing me.

The basic technique is to store the protected data in a different table, then
encrypt the key that links those two tables together. Of course, the database
can no longer enforce referential integrity at that point, and it's totally
on your application to manage that. By doing so, you get speedy, indexed
searches, and the decryption overhead is only felt when it's time to
decrypt the foreign key and link the data. For a query that's selecting
50 rows out of 50,000, this is a huge win, and in the worst case, it's still
no worse than encrypting the data itself.

--
Bill Moran
http://www.potentialtech.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bill Todd 2009-11-15 02:37:53 Re: safelly erasing dirs/files
Previous Message Thom Brown 2009-11-14 22:51:03 Re: pgday.eu