[RFC] A tackle to the leaky VIEWs for RLS

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: heikki(dot)linnakangas(at)enterprisedb(dot)com, robertmhaas(at)gmail(dot)com, sfrost(at)snowman(dot)net
Subject: [RFC] A tackle to the leaky VIEWs for RLS
Date: 2010-06-01 08:39:41
Message-ID: 4C04C74D.8070409@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

As it was reported before, we have an open item about leaky VIEWs for RLS.

On the talk at Ottawa, Robert suggested me to post my idea prior to submit
a patch. So, I'd like to explain my idea at first.
Actually I'm not familiar to optimizar details, so it needs any helps from
experts of optimizar.

The problem was ...

* Using views for row-level access control is leaky
http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php

Even if a table is unvisible from certain users without views that filter
a part of tuples, it can leak to users as long as they can define their own
functions.

It seems to me the problem can be divided into two major parts.

See the following sample tables, views and functions.

postgres=# CREATE TABLE t1 (a int primary key, b text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE
postgres=# CREATE TABLE t2 (x int primary key, y text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
INSERT 0 3
postgres=# INSERT INTO t2 VALUES (1, 'xxx'), (2, 'yyy'), (3, 'zzz');
INSERT 0 3

-- We assume the security policy function needs the given integer key
-- is odd number to be visible for users.
--
postgres=# CREATE OR REPLACE FUNCTION f_policy(int) RETURNS bool
AS 'BEGIN RETURN $1 % 2 = 1; END' LANGUAGE plpgsql;
CREATE FUNCTION

-- We assume a malicious user defined function raises a notice with
-- given arguments. It may be possible to insert it other temp tables.
--
postgres=# CREATE OR REPLACE FUNCTION f_malicious(text) RETURNS bool COST 0.0001
AS 'BEGIN RAISE NOTICE ''f_malicious: %'', $1; RETURN true; END;' LANGUAGE plpgsql;
CREATE FUNCTION

[1] The order of scan filters to be evaluated
----------------------------------------------
The first problem is an inversion of evaluation of scan filters.

postgres=# CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1 WHERE f_policy(a);
CREATE VIEW

postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
-------------------------------------------------------
Seq Scan on t1 (cost=0.00..329.80 rows=137 width=36)
Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

postgres=# SELECT * FROM v1 WHERE f_malicious(b);
NOTICE: f_malicious: aaa
NOTICE: f_malicious: bbb <-- leaky contents
NOTICE: f_malicious: ccc
a | b
---+-----
1 | aaa
3 | ccc
(2 rows)

In this case, owner of the view expects tuples within t1 shall be filtered
by f_policy() functions, so tuples with even-number shall be invisible.
However, the optimizar reorders evaluation of scan filters based on the cost
parameter of functions and others, then f_malicious() was invoked prior to
f_policy(). It is a right approach, if functions are not malicious.
But user may define a malicious purpose function.

The given query is internally rewritten, then subquery will be pulled up
in the optimizar logic.

SELECT * FROM v1 WHERE f_malicious(b);
-> SELECT * FROM (SELECT * FROM t1 WHERE f_policy(a)) v1 WHERE f_malicious(b)
-> SELECT * FROM t1 WHERE f_policy(a) AND f_malicious(b)

During we create a scan plan, the order_qual_clauses() computes the best
order to evaluate the given WHERE clause based on the cost estimation.
In this case, f_malicious() has very small cost, so order_qual_clauses()
decides the f_malicious() should be invoked earlier than f_policy().
In the result, ExecScan() invokes f_malicious() with contents of scanned
tuples to be invisible.

I have an idea that we add FuncExpr a new field (e.g nestlevel) to remember
where is originally put in the query, and prevent reordering over the nest
level of subqueries.
In above example, f_malicious() has nestlevel=0 because it is put on the top
level.
But f_policy() has nestlevel=1 because it is originally put on the second
level subquery. Then, the order_qual_clauses() will check nestlevel of the
scan filter prior to reorder them based on the cost estimation.
Even if we have multiple nestlevels, solution will be same. A FuncExpr with
larger nestlevel shall be invoked earlier than others.

Please note that we only focus on user defined functions.
For example, it is worth to choose index-scans instead of seq-scans, when
a user provides conditions which can be indexed, as follows:

SELECT * FROM v1 WHERE a = 100;
-> SELECT * FROM (SELECT * FROM t1 WHERE f_policy(a)) v1 WHERE a = 100;

In this case, we should scan the t1 using index with the condition of 'a = 100'
prior to evaluation of f_policy(). Any operators eventually invokes a function
being correctly installed, but an assumption is that we can trust operators,
index access method, type input/output methods, conversions and so on, because
these features have to be installed by DBA (or initdb).

[2] Unexpected distribution of scan filter
-------------------------------------------

Here is one other situation of leaky VIEWs for RLS.

postgres=# CREATE OR REPLACE VIEW v2 AS
SELECT * FROM t1 JOIN t2 ON a = x WHERE f_policy(a);
CREATE VIEW
postgres=# SELECT * FROM v2;
a | b | x | y
---+-----+---+-----
1 | aaa | 1 | xxx
3 | ccc | 3 | zzz
(2 rows)

This view intends to provide a joined virtual relation with a restriction using
f_policy(). In fact, it filters out tuples with even-number key.
However, we can leak information of the filtered tuples with different scenarios.

postgres=# SELECT * FROM v2 WHERE f_malicious(y);
NOTICE: f_malicious: xxx
NOTICE: f_malicious: yyy <-- leaky contents
NOTICE: f_malicious: zzz
a | b | x | y
---+-----+---+-----
1 | aaa | 1 | xxx
3 | ccc | 3 | zzz
(2 rows)

postgres=# EXPLAIN SELECT * FROM v2 WHERE f_malicious(y);
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop (cost=0.00..287.63 rows=410 width=72)
-> Seq Scan on t2 (cost=0.00..22.30 rows=410 width=36)
Filter: f_malicious(y)
-> Index Scan using t1_pkey on t1 (cost=0.00..0.63 rows=1 width=36)
Index Cond: (t1.a = t2.x)
Filter: f_policy(t1.a)
(6 rows)

We can see the f_malicious() was distributed to the Seq-Scan on t2, not the
Nested-Loop, because its arguments only depends on the t2, so the optimizar
tries to distribute the scan filter into the least unit of scan.

IIUC, the distribute_qual_to_rels() determines what scan-plan should have
what scan-filters based on dependency of the function call.

For example, the second qualifier depends on both of t1 and t2, it was
distributed to outside of the join.
postgres=# EXPLAIN SELECT * FROM v2 WHERE f_malicious(y) and b || y != 'aaaxxx';
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop (cost=0.00..289.68 rows=408 width=72)
Join Filter: ((t1.b || t2.y) <> 'aaaxxx'::text)
-> Seq Scan on t2 (cost=0.00..22.30 rows=410 width=36)
Filter: f_malicious(y)
-> Index Scan using t1_pkey on t1 (cost=0.00..0.63 rows=1 width=36)
Index Cond: (t1.a = t2.x)
Filter: f_policy(t1.a)
(7 rows)

My idea is similar to what I proposed at [1]. It adds a new field into
RelOptInfo (or other structure?) to remember the original nestlevel of
the scan, then it will be compared to nestlevel of the FuncExpr.
If nestlevel of the FuncExpr is smaller than nestlevel of the RelOptInfo,
it prevents to distribute the FuncExpr onto the RelOptInfo, even if the
function depends on only the relation of RelOptInfo.

The way to handle trusted nodes are same as [1]. If user given operators
depend on only one-side of join, this idea does not prevent anything.

[3] Issues of the approach
---------------------------
Can we find out any other scenario that malicious user defined function
allows us to leak invisible tuples to be filtered out.
Of course, we can never prove software being bug-free. What I want to say
is that "Please point out, if I'm missing something significant scenario".

How much performance impact? Is it reasonable, or not?
I'm not sure whether it is really worse in performance, or not, because
it affects only user defined functions, not operators. So, it seems to
me the idea does not prevent plans to boost using index-scan.

It seems to me the idea on [1] does not make significant regressions,
because it does not change scale of the plan. But the idea on [2] may
affect to scale of the plan, if user defined function filters most of
tuples within one-side tables of the join.

IIRC, I was suggested to distinguish VIEWs with/without security purpose.
It also seems to me an option. If a certain VIEW has its priority on row
level security, it seems to me reasonable to disable a part of optimization
which I introduced above.

Sorry for the long description.
Please any comments.
--
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-06-01 09:00:23 Re: BUG #5487: dblink failed with 63 bytes connection names
Previous Message Mohammad Heykal Abdillah 2010-06-01 07:49:53 What the function name to get the contents table/tuple ?