[v9.3] Row-Level Security

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: [v9.3] Row-Level Security
Date: 2012-06-14 15:43:39
Message-ID: CADyhKSWGtZqpsXtF7_q2FvKRvX6RqW+xv7VmxNmj4gubSBoo-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch provides bare row-level security feature.

Table's owner can set its own security policy using the following syntax:
ALTER TABLE <table> SET ROW LEVEL SECURITY (<condition>);
ALTER TABLE <table> RESET ROW LEVEL SECURITY;

The condition must be an expression that returns a boolean value
and can reference contents of each rows. (Right now, it does not
support to contain SubLink in the expression; to be improved)

In the previous discussion, we planned to add a syntax option to
clarify the command type to fire the RLS policy, such as FOR UPDATE.
But current my opinion is, it is not necessary. For example, we can
reference the contents of rows being updated / deleted using
RETURNING clause. So, it does not make sense to have different
RLS policy at UPDATE / DELETE from SELECT.

If and when user's query (SELECT, UPDATE or DELETE, not INSERT)
references the relation with RLS policy, only rows that satisfies the
supplied condition are available to access.
It performs as if the configured condition was implicitly added to
the WHERE clause, however, this mechanism tries to replace
references to the table with RLS policy by a simple sub-query
that scans the target table with RLS policy, to ensure the policy
condition is evaluated earlier than any other user given qualifier.

EXPLAIN shows how RLS works.

postgres=# ALTER TABLE sample SET ROW LEVEL SECURITY (z > current_date - 10);
ALTER TABLE

postgres=# EXPLAIN SELECT * FROM sample WHERE f_leak(y);
QUERY PLAN
------------------------------------------------------------------------------------
Subquery Scan on sample (cost=0.00..42.54 rows=215 width=40)
Filter: f_leak(sample.y)
-> Seq Scan on sample (cost=0.00..36.10 rows=644 width=66)
Filter: ((z > (('now'::cstring)::date - 10)) OR
has_superuser_privilege())
(4 rows)

In above example, the security policy does not allow to reference
rows earlier than 10 days. Then, SELECT on the table was
expanded to a sub-query and configured expression was added
inside of the sub-query. Database superuser can bypass any
security checks, so "OR has_superuser_privilege()" was automatically
attached in addition to user configured expression.

On the other hand, I'm not 100% sure about my design to restrict
rows to be updated and deleted. Similarly, it expands the target
relation of UPDATE or DELETE statement into a sub-query with
condition. ExecModifyTable() pulls a tuple from the sub-query,
instead of regular table, so it seems to me working at least, but
I didn't try all the possible cases of course.

postgres=# EXPLAIN UPDATE sample SET y = y || '_updt' WHERE f_leak(y);
QUERY PLAN
------------------------------------------------------------------------------------------
Update on sample (cost=0.00..43.08 rows=215 width=46)
-> Subquery Scan on sample (cost=0.00..43.08 rows=215 width=46)
Filter: f_leak(sample.y)
-> Seq Scan on sample (cost=0.00..36.10 rows=644 width=66)
Filter: ((z > (('now'::cstring)::date - 10)) OR
has_superuser_privilege())
(5 rows)

I have two other ideas to implement writer side RLS.

The first idea modifies WHERE clause to satisfies RLS policy, but Robert
concerned about this idea in the previous discussion, because it takes
twice scans.
UPDATE sample SET y = y || '_updt' WHERE f_leak(y);
shall be modified to:
UPDATE sample SET y = y || '_updt' WHERE ctid = (
SELECT ctid FROM (
SELECT ctid, * FROM sample WHERE <RLS policy>
) AS pseudo_sample WHERE f_leak(y)
);
Although the outer scan is ctid scan, it takes seq-scan at first level.

The second idea tries to duplicate RangeTblEntry of the target relation
to be updated or deleted, then one perform as target relation as is, and
the other performs as data source to produce older version of tuples;
being replaced by a sub-query with RLS condition.
I didn't try the second idea yet. As long as we can patch the code that
assumes the target relation has same rtindex with source relation, it
might be safe approach. However, I'm not sure which is less invasive
approach compared to the current patch.

Of course, here is some limitations, to keep the patch size reasonable
level to review.
- The permission to bypass RLS policy was under discussion.
If and when we should provide a special permission to bypass RLS
policy, the "OR has_superuser_privilege()" shall be replaced by
"OR has_table_privilege(tableoid, 'RLSBYPASS')".
Right now, I allows only superuser to bypass RLS policy.
- This patch focuses on the bare feature only, not any enhancement
at query optimization feature, so RLS policy might prevent index-scan,
right now.
- RLS policy is not applied to the row to be inserted, or newer version
of row to be updated. It can be implemented using before-row trigger.
It might be an idea to inject RLS trigger function automatically, like
FK constraints, but not yet.
- As Florian pointed out, current_user may change during query
execution if DECLARE and FETCH are used.
Although it is not a matter in RLS itself, should be improved later.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment Content-Type Size
pgsql-v9.3-row-level-security.v1.patch application/octet-stream 75.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2012-06-14 15:59:26 Re: Minimising windows installer password confusion
Previous Message Amit Kapila 2012-06-14 15:39:07 Allow WAL information to recover corrupted pg_controldata