Re: Row-Level Security

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row-Level Security
Date: 2009-12-13 12:11:38
Message-ID: 603c8f070912130411q765c7e9aw371b468229b8e32@mail.gmail.com
Lists: pgsql-hackers

On Sun, Dec 13, 2009 at 3:50 AM, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> (2009/12/13 12:18), Robert Haas wrote:
>> On Sat, Dec 12, 2009 at 7:41 PM, Josh Berkus<josh(at)agliodbs(dot)com>  wrote:
>>> I blogged about this some time ago.  One issue I can see is that I
>>> believe that the RLS which many users want is different from the RLS
>>> which SEPostgres implements.
>>>
>>> Links:
>>>
>>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-1-30732
>>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-2-30757
>>
>> I read these blog entries a while ago but had forgotten about them.
>> They're very good, and summarize a lot of my thinking on this topic as
>> well.  I think that we can design a framework for row-level security
>> which can encompass both constraint-based security and label-based
>> security.  Both seem to me to be based around doing essentially the
>> following things:
>>
>> 1. Adding columns to the table to store access control information.
>> 2. Populating those columns with additional information (owner, ACL,
>> security label, etc.) which can be used to make access control
>> decisions.
>> 3. Injecting logic into incoming queries which uses the information
>> inserted by (1) to filter out rows to which the access control policy
>> does not wish to allow access.
>>
>> Waving my hands in the air, #1 and #2 seem pretty straightforward.
>> For constraint-based security, one can imagine just adding a column to
>> the table and then adding a BEFORE INSERT OR UPDATE FOR EACH ROW
>> trigger that populates that column.  For label-based MAC, that's not
>> going to be quite sufficient, because the system needs to ensure that
>> the trigger that populates the security column must run last; if it
>> doesn't, some other trigger can come along afterwards and slip in a
>> value that isn't supposed to be there; plus, it might be inconvenient
>> to need to define this trigger for every table that needs RLS.
>
> Right, label-based MAC needs its hook to be called after all the BR-Insert
> triggers to assign a correct security label, not only for access control.
> I'd like to point out one more thing. When we update tuples, "invisible"
> tuples have to be filtered out before trigger functions run.
>
>> However, those problems don't seem insurmountable.  Suppose we provide
>> a hook function that essentially acts like a global BEFORE INSERT OR
>> UPDATE trigger but which fires after all of the regular triggers.
>
> Basically, right. In my branch, SE-PgSQL put its hook after all the BR
> trigger invocations.
>
> http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883
>
> But we have another approach. When RelationBuildTriggers() initializes
> the TriggerDesc of a Relation, we can inject the security hook as a special
> BR-trigger at the end. If we initialize it there, we don't need to modify
> the COPY FROM implementation in addition to INSERT.
>
> The reason why I didn't apply this approach is that it needs more
> modification to the core routines, which makes out-of-tree code harder
> to manage.

That's definitely something to consider if it's true. Why did it
require more modification of the core routines?

>> SE-PostgreSQL can gain control at that point and search through the
>> columns of the target relation for a column called, say,
>> sepg_security_label.  If it finds such a column and that column is of
>> the appropriate type, then (1) if an explicit security label is
>> provided, it checks whether the specified label is permissible, (2)
>> otherwise, if the operation is insert, it determines the appropriate
>> default label for the current security context and inserts it, (3)
>> otherwise, it just leaves the current label alone.  This might not be
>> quite the right behavior but the point is whatever behavior you want
>> to have in terms of assigning/disallowing values for that column
>> should be possible to implement here.  The upshot is that if the
>> system administrator creates an sepg_security_label column of the
>> correct type, row-level security will be enabled for that table.
>> Otherwise, it will not.
>
> Basically, right. SE-PgSQL (or others) assign a new tuple either an
> explicitly given or a default security label, then it checks permission
> whether the client can insert a tuple with this label, or not.
>
> One point. MAC is "mandatory", so the table owner should not be able to
> control whether row-level checks are applied or not.
> So, I used a special-purpose system column to represent the security label.
> It is generated for each table, and consumes no additional storage
> when the MAC feature is disabled.

My current feeling is that a special-purpose system column is not the
best approach. I don't see what we gain by doing it that way. Even
in an SE-PostgreSQL environment, row-level security might not be
desired on every table - after all, we've been told that SE-PostgreSQL
is useful without any row-level security AT ALL, so it's not hard to
think there could be environments where only some tables need to be
protected. So I think we want to have a way to turn it on and off on
a per-table basis.

Of course, as you point out, we have to make sure that anyone who
tries to turn RLS on or off for a particular table is authorized to
perform that operation. But that's a separate problem which I
don't think has much to do with row-level security.
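
The label-assignment rule quoted above (check an explicit label,
default it on insert, otherwise leave it alone) can be sketched in a
few lines. This is only a toy model in Python, not SE-PostgreSQL code:
assign_label(), default_label and is_permitted are hypothetical names,
and "explicitly provided" is assumed to mean "present and different
from the old row's label". Only the column name sepg_security_label
comes from the proposal.

```python
LABEL_COLUMN = "sepg_security_label"  # column name from the quoted proposal


def assign_label(columns, operation, new_row, old_row, default_label, is_permitted):
    """Toy model of a hook that fires after all regular BEFORE triggers.

    columns       -- set of column names of the target table
    operation     -- "INSERT" or "UPDATE"
    new_row       -- dict for the row about to be stored
    old_row       -- dict for the existing row (None for INSERT)
    default_label -- callable returning the default label (hypothetical)
    is_permitted  -- callable(label) -> bool (hypothetical policy check)
    """
    if LABEL_COLUMN not in columns:
        # No label column: row-level security is not enabled for this table.
        return new_row

    row = dict(new_row)
    supplied = row.get(LABEL_COLUMN)
    unchanged = old_row is not None and supplied == old_row.get(LABEL_COLUMN)

    if supplied is not None and not unchanged:
        # (1) An explicit label was provided: check that it is permissible.
        if not is_permitted(supplied):
            raise PermissionError("label %r is not permitted" % supplied)
    elif operation == "INSERT":
        # (2) No explicit label on insert: fill in the default.
        row[LABEL_COLUMN] = default_label()
    # (3) Otherwise leave the current label alone.
    return row
```

Because the presence of the column is what switches the behavior on,
a table without it passes through untouched, which is the per-table
on/off switch argued for above.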

>> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
>> syntax is going to be very easy to use.  For constraint-based
>> row-security, I think we should have something more like:
>>
>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>> WHERE where-clause
>>
>> (This suffers from the same problem as DELETE ... USING, namely that
>> sometimes you want an outer join between table and othertable.)
>>
>> This gives the user a convenient way to insert a join against one or
>> more side tables if they are so inclined.
>
> Is it reasonably possible to implement USING clause, even if row-level
> security is applied on COPY FROM/TO statement?
> And, isn't it necessary to specify condition to apply the filter?
> (such as select, update and delete)

The filter is the WHERE clause. I would think that the operation
being performed (select, update, delete) wouldn't enter into it. This
part is just to decide which tuples will actually be accessible AT
ALL. If you want to further prevent certain tuples that are being
accessed from being updated or deleted, you can use a trigger for that
(possibly one of the global, always-applied-last triggers discussed
above).
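
To make the split concrete, here is a minimal sketch in Python of the
model described above: the row filter decides visibility the same way
for every operation, and a separate trigger-like check restricts one
specific operation. The Table class and the owner_only and
refuse_locked helpers are invented for illustration and say nothing
about how this would be implemented in the backend.

```python
class Table:
    def __init__(self, rows):
        self.rows = list(rows)
        self.row_filters = []    # visibility predicates, same for every operation
        self.before_delete = []  # trigger-like checks that apply to DELETE only

    def visible(self, user):
        # A row is accessible at all only if every row filter accepts it.
        return [r for r in self.rows if all(f(r, user) for f in self.row_filters)]

    def select(self, user, where=lambda r: True):
        return [r for r in self.visible(user) if where(r)]

    def insert(self, row):
        # Row filters do not apply to INSERT; restricting that is a trigger's job.
        self.rows.append(row)

    def delete(self, user, where=lambda r: True):
        targets = [r for r in self.visible(user) if where(r)]
        for r in targets:
            for trigger in self.before_delete:
                trigger(r, user)  # may raise to veto the whole statement
        doomed = {id(r) for r in targets}
        self.rows = [r for r in self.rows if id(r) not in doomed]
        return len(targets)


def owner_only(row, user):
    # Example row filter: users see only their own rows.
    return row["owner"] == user


def refuse_locked(row, user):
    # Example DELETE trigger: visible rows marked as locked cannot be removed.
    if row.get("locked"):
        raise PermissionError("row %r is locked" % row["id"])
```

With owner_only installed as a filter, a DELETE aimed at someone
else's row simply matches nothing, while a DELETE aimed at a visible
but locked row fails in the trigger.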

For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
FILTER stuff would apply. If you want to restrict what gets inserted,
that's another job for triggers.

>> For security frameworks like SE-PostgreSQL, we might just provide a
>> hook allowing the incoming query tree to be modified, and let the hook
>> function check whether each table in the query has row-level security
>> enabled, and if so perform a modification equivalent to the above.
>
> One point we have to pay attention to is that all the row-level filter
> conditions have to be evaluated before all the user-given conditions,
> except for operators pulled up to index accesses.
> Otherwise, malicious low-cost functions can leak "invisible" tuples anywhere.

We currently have this problem with DAC as well. It means that VIEWs
don't actually work as a security gateway: if the user has the ability
to define a function and pass a WHERE clause to a query against the
view, they can extract the hidden rows. Fixing it seems like a hard
problem.
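
The leak is easy to reproduce in a toy model. The Python sketch below
is not the PostgreSQL planner; it only assumes that conditions are
evaluated cheapest-first with short-circuiting, so a user-supplied
function declared with a very low cost runs before the view's own
condition. scan(), demo() and the cost values are made up for
illustration.

```python
def scan(rows, quals):
    """Return the rows that pass every condition.

    quals is a list of (predicate, cost) pairs, evaluated in list order;
    all() stops at the first predicate that rejects a row.
    """
    return [r for r in rows if all(pred(r) for pred, _cost in quals)]


def demo(security_first):
    seen = []  # everything the user's function got to look at

    def snoop(row):
        # A "malicious" user function: it records its input and accepts the row.
        seen.append(row["payload"])
        return True

    rows = [
        {"owner": "alice", "payload": "a1"},
        {"owner": "bob", "payload": "b1"},
    ]
    view_qual = (lambda r: r["owner"] == "alice", 1.0)  # the view's condition
    user_qual = (snoop, 0.0001)                         # declared as very cheap

    if security_first:
        quals = [view_qual, user_qual]  # filter first, regardless of cost
    else:
        quals = sorted([view_qual, user_qual], key=lambda q: q[1])

    return scan(rows, quals), seen
```

In both orderings the query returns only alice's row. With
cost-based ordering, though, snoop() has already been called on bob's
row by the time the view's condition rejects it, so the hidden payload
escapes through a side channel. Forcing the security condition first
closes the hole, at the price of ignoring the cost estimates.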

>> None of this addresses the issue of doing RLS on system catalogs,
>> which seems like a much harder problem, possibly one that we should
>> just ignore for the first phase of this project.
>
> It is reasonable.
>
> I'd like to point out a few more issues:
>
> * TRUNCATE statement
>
> Truncate is a good feature to clean up the contents of a table.
> But it may contain unremovable tuples. So, it needs to scan a table to
> be truncated once to confirm all the tuples can be removed by the current
> user. It is a trade-off case between performance and security.

I think we should just disallow TRUNCATE in cases where this might be
an issue. If you want a slow and painful way to get rid of your table
contents, use DELETE. Or at least, I'd start by doing it this way and
then we can think about whether there's enough benefit to doing what
you're suggesting later.
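
As a rough illustration of that starting point, in Python and with
made-up names (truncate(), delete_all(), rls_enabled, can_delete):
TRUNCATE is refused outright whenever row-level security is in play,
and the slow path checks each row individually.

```python
class RowSecurityError(Exception):
    pass


def truncate(table_rows, rls_enabled):
    # Fast path: no per-row checks, so refuse whenever RLS might matter.
    if rls_enabled:
        raise RowSecurityError(
            "TRUNCATE is not allowed on a table with row-level security; use DELETE"
        )
    table_rows.clear()


def delete_all(table_rows, can_delete):
    # Slow path: visit every row and remove only those the user may delete.
    kept = [r for r in table_rows if not can_delete(r)]
    removed = len(table_rows) - len(kept)
    table_rows[:] = kept
    return removed
```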

> * Foreign Key constraint(1)
>
> I don't think the upcoming label-based MAC feature will address covert
> channel issues. Even if a PK is invisible, we can guess that the PK exists
> from the FK. In fact, commercial database products (such as Oracle Label
> Security) also do not care about this.

While I can't speak for anyone else, I don't have a problem not caring
about this.

> * Foreign Key constraint(2)
>
> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
> If associated tuples are filtered out, it breaks referential integrity.
> So, we have to apply special care. In the SE-PgSQL case, it raises an error
> instead of filtering during FK checks. And the row-level security hook is
> called last for each tuple, unlike in normal cases.

Perfecting referential integrity here seems like a pretty tough
problem, but it's likely not necessary to solve it in order to get an
implementation of row-level security that is useful for some purposes.

...Robert
