Re: Fast Search on Encrypted Feild

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Naoko Reeves <naoko(at)lawlogix(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fast Search on Encrypted Feild
Date: 2009-11-15 11:54:29
Message-ID: 45F5F72F-5379-4EC0-9879-64ED8A709EC4@solfertje.student.utwente.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 14 Nov 2009, at 22:27, Naoko Reeves wrote:

> I have a encrypted column use encrypt function.
> Querying against this column is almost not acceptable – returning 12 rows took 25,908 ms.
> The query was simply Select decrypt(phn_phone_enc) FROM phn WHERE decrypt(phn_phone_enc,’xxx’,’xxx’) LIKE ‘123%’
> So I built index like: CREATE INDEX idx_phn_phone_dec ON phn (decrypt(phn_phone_enc, ‘xxx’, ‘xxx’))
> This returns 12 rows in 68 ms.
> Would this be the solution for the fast encrypted field search or does this raise the security issue?

Is there some way you can invert the process?
Normally if you verify encrypted data (typically passwords) you would encrypt the user-specified data and compare that to the encrypted data in the database instead of decrypting both and comparing the actual data. I doubt you can do that with partial data though, and since you're showing a LIKE expression here...

One thing I notice in your query is that you're decrypting your data twice; you're calling two different functions for the same purpose. You may notice a speed-up if you use decrypt(text) instead of decrypt(text, text, text) in your where clause, provided that the function is defined stable[1] or immutable[2] as opposed to the default volatile[3].

Or is decrypting only a part of the encrypted string significantly faster? That would imply some linear encryption algorithm in which case you may be able to use my initial suggestion: Encrypting '123' would create something that's comparable to the first 3 encrypted characters of an encrypted phone number. A query would look like: SELECT decrypt(phn_phone_enc) FROM phn WHERE phn_phone_enc LIKE encrypt('123')||'%'. Here as well it is important that encrypt(text) is defined stable or immutable.

1. The same input data combined with data from the DB (as it is visible to the transaction) always yields the same result.
2. The same input data always yields the same result.
3. There is no correlation between the input data and the result.

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.

!DSPAM:737,4affebf911071302014309!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2009-11-15 14:29:24 Re: Fast Search on Encrypted Feild
Previous Message Jorge Godoy 2009-11-15 11:50:18 Re: Experience with many schemas vs many databases