Re: Row-Level Security

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Josh Berkus <josh(at)agliodbs(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row-Level Security
Date: 2009-12-14 04:57:22
Message-ID: 4B25C5B2.9010108@ak.jp.nec.com
Lists: pgsql-hackers

Robert Haas wrote:
> On Sun, Dec 13, 2009 at 3:50 AM, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> Basically, right. In my branch, SE-PgSQL put its hook after all the BR
>> trigger invocations.
>>
>> http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883
>>
>> But there is another approach. When RelationBuildTriggers() initializes
>> the TriggerDesc of a Relation, we can inject the security hook as a
>> special BR trigger that always fires last. If we set it up there, we
>> don't need to modify the COPY FROM implementation as well as INSERT.
>>
>> The reason I didn't take this approach is that it needs more changes
>> to the core routines, which makes out-of-tree code harder to maintain.
>
> That's definitely something to consider if it's true. Why did it
> require more modification of the core routines?

In my local branch, it just adds two lines as follows:
+ /* SELinux labeling and permission checks */
+ sepgsql_heap_insert(resultRelationDesc, tuple);

That is obviously less than modifying RelationBuildTriggers() to allocate
an additional slot in the TriggerDesc array and add an entry there.
That reasoning was only about ease of maintaining out-of-tree code; a
different perspective will be necessary when proposing the feature upstream.

>> One point. MAC is "mandatory", so the table owner should not be able to
>> control whether row-level checks are applied or not.
>> So I used a special-purpose system column to represent the security label.
>> It is generated for each table, and it consumes no additional storage
>> when the MAC feature is disabled.
>
> My current feeling is that a special-purpose system column is not the
> best approach. I don't see what we gain by doing it that way. Even
> in an SE-PostgreSQL environment, row-level security might not be
> desired on every table - after all, we've been told that SE-PostgreSQL
> is useful without any row-level security AT ALL, so it's not hard to
> think there could be environments where only some tables need to be
> protected. So I think we want to have a way to turn it on and off on
> a per-table basis.
>
> Of course, as you point out, we have to make sure that anyone who
> tries to turn RLS on or off for a particular table is authorized to
> perform that operation. But that's a separate problem which I
> don't think has much to do with row-level security.

Yes, it is a separate problem that need not be settled at the moment.
(Perhaps it depends on the security model. In DAC, a per-table basis is
preferable.)

So I'd just like to raise an issue to be discussed later.
When we build a binary with a label-based MAC such as SE-PgSQL, the MAC
should be turned on or off at startup time.
(I don't assume it needs to be configurable at runtime.)

If we set up a database cluster without any label-based MAC, none of the
tuples will have a security label. If the security label is stored in a
regular column, we first have to modify the schema of every table.
If a system column provides the security label of a tuple, we can
generate an appropriate security label dynamically. In the SELinux case,
any unlabeled object is treated as if it had the pseudo security label:
system_u:object_r:unlabeled_t:s0

Needless to say, we need to assign appropriate security labels later for
meaningful access control, but that does not require any schema changes,
even if we repeatedly turn the label-based MAC feature on and off.

When the label-based MAC feature is disabled, this system column can
return a pseudo value such as NULL or an empty string.
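
The behaviour described above can be sketched roughly as follows. This is
only an illustration: the function name tuple_security_label() and its
parameters are hypothetical and are not taken from the SE-PgSQL branch.
It assumes the disabled case returns NULL rather than an empty string.

```c
#include <stdbool.h>
#include <stddef.h>

/* Pseudo label SELinux assumes for unlabeled objects (quoted above). */
#define UNLABELED_CONTEXT "system_u:object_r:unlabeled_t:s0"

/*
 * Hypothetical helper: compute the value the security-label system
 * column would expose for one tuple.
 *
 *   mac_enabled  - whether label-based MAC was turned on at startup
 *   stored_label - label stored with the tuple, or NULL if it has none
 */
const char *
tuple_security_label(bool mac_enabled, const char *stored_label)
{
    if (!mac_enabled)
        return NULL;                /* feature disabled: pseudo value */
    if (stored_label == NULL)
        return UNLABELED_CONTEXT;   /* unlabeled tuple: generated on the fly */
    return stored_label;            /* labeled tuple: return it unchanged */
}
```

Because the label is computed when the column is read, no schema change
is needed when the feature is switched on for an existing cluster.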

>>> #3 seems a little bit trickier. I don't think the GRANT ... WHERE
>>> syntax is going to be very easy to use. For constraint-based
>>> row-security, I think we should have something more like:
>>>
>>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>>> WHERE where-clause
>>>
>>> (This suffers from the same problem as DELETE ... USING, namely that
>>> sometimes you want an outer join between table and othertable.)
>>>
>>> This gives the user a convenient way to insert a join against one or
>>> more side tables if they are so inclined.
>> Is it reasonably possible to implement the USING clause when row-level
>> security is also applied to COPY FROM/TO statements?
>> And isn't it necessary to specify which operations the filter applies to
>> (such as select, update and delete)?
>
> The filter is the WHERE clause. I would think that the operation
> being performed (select, update, delete) wouldn't enter into it. This
> part is just to decide which tuples will actually be accessible AT
> ALL. If you want to further prevent certain tuples that are being
> accessed from being updated or deleted, you can use a trigger for that
> (possibly one of the global, always-applied-last triggers discussed
> above).
>
> For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
> FILTER stuff would apply. If you want to restrict what gets inserted,
> that's another job for triggers.

Are you talking about COPY TO as well, not only COPY FROM?
For INSERT and COPY FROM, I agree with the direction. Access controls
(and labeling) should be applied in the BR trigger functions.

But COPY TO should filter out violated tuples properly, because otherwise
it can be a big bypass of row-level access controls.
If the WHERE clause does not refer to any other relations, it is not
difficult to handle correctly.

>>> For security frameworks like SE-PostgreSQL, we might just provide a
>>> hook allowing the incoming query tree to be modified, and let the hook
>>> function check whether each table in the query has row-level security
>>> enabled, and if so perform a modification equivalent to the above.
>> One point we have to pay attention to is that all the row-level filter
>> conditions have to be evaluated before any user-given condition, except
>> for operators pulled up into index accesses.
>> Otherwise, malicious low-cost functions could leak "invisible" tuples.
>
> We currently have this problem with DAC as well - it means that VIEWs
> don't actually work as a security gateway, if the user has the ability
> to define a function and pass a WHERE clause to a query against the
> view, they can extract the hidden rows. Fixing it seems like a hard
> problem.

Yes, we need to consider a reasonable solution for this matter.

>> * TRUNCATE statement
>>
>> TRUNCATE is a good way to clean up the contents of a table.
>> But the table may contain tuples the user cannot remove. So we need to
>> scan the table once before truncating it, to confirm that all the tuples
>> can be removed by the current user. It is a trade-off between
>> performance and security.
>
> I think we should just disallow TRUNCATE in cases where this might be
> an issue. If you want a slow and painful way to get rid of your table
> contents, use DELETE. Or at least, I'd start by doing it this way and
> then we can think about whether there's enough benefit to doing what
> you're suggesting later.

It seems to me that a uniform prohibition is more painful than violation
checks on the table to be truncated. At least, we should provide an option
to check that the table to be truncated does not contain any unremovable
tuples when row-level checks are activated.

However, as you pointed out, it is not the first issue to be resolved.
It may be a TODO item.
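
As a minimal sketch of the optional pre-check suggested above: tuples are
modeled as plain ints, and truncate_permitted(), removable_fn and
sample_removable() are hypothetical names used only for illustration.
They are not part of PostgreSQL or SE-PgSQL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-tuple check: may the current user remove this tuple? */
typedef bool (*removable_fn)(int tuple);

/* Illustrative policy only: tuples with a value below 100 are removable. */
bool
sample_removable(int tuple)
{
    return tuple < 100;
}

/*
 * Scan the whole table once. TRUNCATE is permitted only if every tuple
 * passes the check; a single unremovable tuple rejects the operation.
 */
bool
truncate_permitted(const int *tuples, size_t n, removable_fn removable)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!removable(tuples[i]))
            return false;
    }
    return true;
}
```

The cost is one full scan of the table, which is the performance side of
the trade-off; an empty table passes trivially.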

>> * Foreign Key constraint(2)
>>
>> FK is implemented as triggers which internally use SELECT/UPDATE/DELETE.
>> If associated tuples are filtered out, it breaks referential integrity.
>> So we have to take special care. In the SE-PgSQL case, it raises an error
>> instead of filtering during FK checks. And the row-level security hook is
>> called last for each tuple, unlike in normal cases.
>
> Perfecting referential integrity here seems like a pretty tough
> problem, but it's likely not necessary to solve it in order to get an
> implementation of row-level security that is useful for some purposes.

Is the approach in SE-PgSQL suitable for this issue?
It can prevent updating or deleting a tuple referenced by invisible tuples.

We have two modes in row-level security.
The first is filtering mode. It applies the security policy function prior
to any other user-given conditions, and filters out violated tuples from
the result set.
The second is aborting mode. It is only used by internal code which does
not put any malicious function in the condition. It applies the security
policy function after all of the WHERE clause, and raises an error if the
query tries to refer to violated tuples.
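
The difference in evaluation order between the two modes can be sketched
as below. This is a toy model, not the SE-PgSQL code: tuples are plain
ints, an error is represented by a -1 return value, and every name here
(rls_scan, rls_mode, sample_policy, sample_qual) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*tuple_pred)(int tuple);

typedef enum { RLS_FILTERING, RLS_ABORTING } rls_mode;

/* Illustrative policy: tuples with a value below 100 are accessible. */
bool sample_policy(int tuple) { return tuple < 100; }

/* Illustrative WHERE condition: select even values. */
bool sample_qual(int tuple) { return tuple % 2 == 0; }

/*
 * Copy qualifying tuples into out[] and return how many were copied,
 * or -1 if aborting mode meets a tuple that violates the policy.
 */
int
rls_scan(const int *tuples, size_t n, rls_mode mode,
         tuple_pred policy, tuple_pred qual, int *out)
{
    int nout = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (mode == RLS_FILTERING)
        {
            /* Policy first: the condition never sees a violated tuple. */
            if (!policy(tuples[i]) || !qual(tuples[i]))
                continue;
        }
        else
        {
            /* Trusted internal condition first, policy checked last. */
            if (!qual(tuples[i]))
                continue;
            if (!policy(tuples[i]))
                return -1;      /* stands in for raising an error */
        }
        out[nout++] = tuples[i];
    }
    return nout;
}
```

In filtering mode a violated tuple silently disappears from the result.
In aborting mode the same tuple stops the query, which is what an FK
check needs in order to keep referential integrity.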

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
