Re: Column Redaction

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Damian Wolgast <damian(dot)wolgast(at)si-co(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Redaction
Date: 2014-10-10 12:25:06
Message-ID: 5437D022.1080408@vmware.com
Lists: pgsql-hackers

On 10/10/2014 02:27 PM, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnakangas(at)vmware(dot)com) wrote:
>> On 10/10/2014 02:05 PM, Stephen Frost wrote:
>>> * Heikki Linnakangas (hlinnakangas(at)vmware(dot)com) wrote:
>>>> On 10/10/2014 01:35 PM, Stephen Frost wrote:
>>>>> Regarding functions, 'leakproof' functions should be alright to allow,
>>>>> though Heikki brings up a good point regarding binary search being
>>>>> possible in a plpgsql function (or even directly by a client). Of
>>>>> course, that approach also requires that you have a specific item in
>>>>> mind.
>>>>
>>>> It doesn't require that you have a specific item in mind. Binary
>>>> search is cheap, O(log n). It's easy to write a function to do a
>>>> binary search on a single item, passed as argument, and then apply
>>>> that to all rows:
>>>>
>>>> SELECT binary_search_reveal(cardnumber) FROM redacted_table;
>>>
>>> Note that your binary_search_reveal wouldn't be marked as leakproof and
>>> therefore this wouldn't be allowed. If this was allowed, you'd simply
>>> do "raise notice" inside the function and call it a day.
>>
>> *shrug*, just do the same with a more complicated query, then. Even
>> if you can't create a function that does that, you can still execute
>> the same logic without a function.
>
> Not sure I see what you're getting at here..? My point was that you'd
> need a target number and the system would only provide confirmation that
> the number exists, or does not. Your argument was that the table
> itself would provide the target number, which was flawed. I don't see
> how "just do the same with a more complicated query" removes the need to
> have a target number for the binary search.

You said above that it's OK to pass the card numbers to leakproof
functions. But if you allow that, you can write a function that takes a
redacted card number as its argument and unredacts it (using the < and =
operators in a binary search). And then you can just do "SELECT
unredact(card_number) FROM redacted_table".
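
A minimal sketch of that binary search, written in Python rather than
plpgsql so it can be run standalone. The RedactedValue class and its
less_than/equals methods are hypothetical stand-ins for a redacted datum
that exposes only the < and = operators; the 16-digit range is likewise
an assumption about what a card number looks like.

```python
class RedactedValue:
    """Hypothetical redacted datum: callers may only compare it, never read it."""

    def __init__(self, secret: int):
        self.__secret = secret
        self.comparisons = 0  # counts how many operator calls were needed

    def less_than(self, probe: int) -> bool:
        """Models `redacted_column < probe`."""
        self.comparisons += 1
        return self.__secret < probe

    def equals(self, probe: int) -> bool:
        """Models `redacted_column = probe`."""
        self.comparisons += 1
        return self.__secret == probe


def unredact(value: RedactedValue, lo: int = 0, hi: int = 10**16 - 1):
    """Recover the hidden integer using only < and =.

    Invariant: if the secret lies in the initial range, it stays in [lo, hi].
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if value.less_than(mid + 1):  # secret <= mid
            hi = mid
        else:                         # secret > mid
            lo = mid + 1
    # Confirm with equality; None means the secret was outside the range.
    return lo if value.equals(lo) else None


# No target number is needed: the same search runs over every row.
rows = [RedactedValue(4929127312345678), RedactedValue(17), RedactedValue(0)]
recovered = [unredact(v) for v in rows]
```

Each row costs a number of comparisons logarithmic in the size of the
value range, which is what makes applying it to a whole table practical.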

You seem to have something stronger in mind: only allow the equality
operator on the redacted column, and nothing else. That might be better,
although I'm not really convinced. There are just too many ways you
could still leak the datum. Just a random example, inspired by the
recent CRIME attack on SSL: build a row with the redacted datum and
another "guess" datum, and store it along with 1 kB of other data in a
temporary table. The row gets toasted. Observe how well it compressed;
if the guess datum is close to the original datum, it compresses well.
Now, you can probably stop that particular attack with more restrictions
on what you can do with the datum, but that just shows that pretty much
any computation you allow with the datum can be used to reveal its value.
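
The compression side channel can be illustrated with a rough Python
sketch. It uses zlib as a stand-in compressor, which is an assumption:
PostgreSQL's TOAST compression is a different algorithm, so the sizes
here are only indicative. The filler text, the card number and the
stored_size helper are all made up for the illustration.

```python
import zlib

# Roughly 1 kB of unrelated data stored alongside the two datums.
FILLER = b"lorem ipsum dolor sit amet " * 40


def stored_size(secret: str, guess: str) -> int:
    """Size of a row holding the secret, a guess, and the filler after compression.

    In the real attack the server builds and compresses the row; the
    attacker never sees `secret`, only the resulting size.
    """
    row = secret.encode() + guess.encode() + FILLER
    return len(zlib.compress(row))


SECRET = "4929127312345678"  # made-up card number
GUESSES = {
    "exact": SECRET,                  # repeats the secret byte for byte
    "unrelated": "8065039174620855",  # shares no 3-digit run with the secret
}

sizes = {label: stored_size(SECRET, guess) for label, guess in GUESSES.items()}

for label, size in sizes.items():
    print(f"{label:10s} {size} bytes")
```

A matching guess lets the compressor encode the second copy as a short
back-reference, so the row comes out smaller than with an unrelated
guess. That size difference is the only thing the attacker needs to
observe.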

- Heikki
