RE: Why the index is not used ?

From: ROS Didier <didier(dot)ros(at)edf(dot)fr>
To: "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "folarte(at)peoplecall(dot)com" <folarte(at)peoplecall(dot)com>
Cc: "pavel(dot)stehule(at)gmail(dot)com" <pavel(dot)stehule(at)gmail(dot)com>, "pgsql-sql(at)lists(dot)postgresql(dot)org" <pgsql-sql(at)lists(dot)postgresql(dot)org>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Why the index is not used ?
Date: 2018-10-08 14:10:41
Message-ID: 28893ac2b3df41a89ba4266bad6e43ce@PCYINTPEXMU001.NEOPROD.EDF.FR
Lists: pgsql-general pgsql-performance pgsql-sql

Hi Tomas

Thank you for your answer and your recommendation, which is very interesting. I'm going to study the PCI DSS document right now.
- Here is my answer to your question:
>>
What is your threat model?
<<
We want to prevent access to sensitive data by anyone who does not have the encryption key.
If files, backups, or dumps are stolen, we do not want anyone to be able to read the sensitive data.

- I have tested the solution you proposed, and it works great.

Best Regards

Didier ROS
-----Original Message-----
From: tomas(dot)vondra(at)2ndquadrant(dot)com [mailto:tomas(dot)vondra(at)2ndquadrant(dot)com]
Sent: Sunday, October 7, 2018 22:08
To: ROS Didier <didier(dot)ros(at)edf(dot)fr>; folarte(at)peoplecall(dot)com
Cc: pavel(dot)stehule(at)gmail(dot)com; pgsql-sql(at)lists(dot)postgresql(dot)org; pgsql-performance(at)lists(dot)postgresql(dot)org; pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Why the index is not used ?

Hi,

On 10/07/2018 08:32 PM, ROS Didier wrote:
> Hi Francisco
>
> Thank you for your remark.
> You're right, but it's the only procedure I found to make search on
> encrypted fields with good response times (using index) !
>

Unfortunately, that kinda invalidates the whole purpose of in-database encryption - you'll have encrypted on-disk data in one place, and then plaintext right next to it. If you're dealing with credit card numbers, then you presumably care about PCI DSS, and this is likely a direct violation of that.

> Regarding access to the file system, our servers are in protected
> network areas. few people can connect to it.
>

Then why do you need encryption at all? If you assume access to the filesystem / storage is protected, why do you bother with encryption?
What is your threat model?

> it's not the best solution, but we have data encryption needs and good
> performance needs too. I do not know how to do it except the specified
> procedure..
>
> if anyone has any proposals to put this in place, I'm interested.
>

One thing you could do is hash the value and then search by the hash. So aside from the encrypted column you'll also have a short hash, and you can use it in the query *together* with the original condition. It does not need to be unique (in fact it should not be, so that the hash cannot be reversed), but it needs to have enough distinct values to make the index efficient. Say, 10k values should be enough, because each hash value then matches about 0.01% of the rows.

So the function might look like this, for example:

-- IMMUTABLE is required before the function can be used in an index expression
CREATE FUNCTION cchash(text) RETURNS int AS $$
SELECT abs(hashtext($1)) % 10000;
$$ LANGUAGE sql IMMUTABLE;

and then be used like this:

CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc));

and in the query

SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit
WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32'
AND cchash(cc) = cchash('test value 32');
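
One caveat when trying this: pgp_sym_encrypt salts its output, so encrypting the same value twice produces different ciphertexts, and a hash taken over the stored ciphertext would not match the hash of the plaintext search value. A possible variant, sketched here under the assumption that cc is a bytea column filled by pgp_sym_encrypt and that the other columns of cartedecredit have defaults, is to compute the bucket from the plaintext at insert time and keep it in its own column. The names cc_bucket and idx_cartedecredit_bucket are hypothetical and do not come from the thread.

```sql
-- Sketch only: assumes the pgcrypto extension and a table cartedecredit(cc bytea, ...).
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Same bucketing function as in the message; IMMUTABLE lets the planner
-- evaluate cchash('...') once and use it as an index condition.
CREATE OR REPLACE FUNCTION cchash(text) RETURNS int AS $$
  SELECT abs(hashtext($1)) % 10000;
$$ LANGUAGE sql IMMUTABLE;

-- Hypothetical extra column holding the bucket of the plaintext value.
ALTER TABLE cartedecredit ADD COLUMN cc_bucket int;
CREATE INDEX idx_cartedecredit_bucket ON cartedecredit (cc_bucket);

-- The bucket is computed from the plaintext, before encryption.
INSERT INTO cartedecredit (cc, cc_bucket)
VALUES (pgp_sym_encrypt('test value 32', 'motdepasse'),
        cchash('test value 32'));

-- The index narrows the candidates to one bucket; only those rows are decrypted.
SELECT pgp_sym_decrypt(cc, 'motdepasse')
FROM cartedecredit
WHERE cc_bucket = cchash('test value 32')
  AND pgp_sym_decrypt(cc, 'motdepasse') = 'test value 32';
```

With roughly 10k buckets, the decrypt comparison runs on about 0.01% of the rows instead of the whole table. The bucket column does reveal which rows share a bucket, which is the trade-off the non-unique hash is meant to keep small.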

Obviously, this does not really solve the issues with having to pass the password to the query, making it visible in pg_stat_activity, various logs etc.
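
To illustrate that exposure, here is a small sketch of what another session could run. It assumes a role that is allowed to read other sessions' statements (a superuser, or on PostgreSQL 10 and later a member of pg_read_all_stats); other roles see the query text masked.

```sql
-- Lists statements from other sessions that call pgp_sym_decrypt;
-- a passphrase passed as a literal appears verbatim in the query column.
SELECT pid, usename, state, query
FROM pg_stat_activity
WHERE query ILIKE '%pgp_sym_decrypt%'
  AND pid <> pg_backend_pid();
```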

Which is why people generally use full-disk encryption (FDE), which is transparent and provides the same level of protection.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

