Re: leaky views, yet again

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: leaky views, yet again
Date: 2010-09-02 07:33:52
Message-ID: 4C7F5360.6080008@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2010/09/02 13:30), KaiGai Kohei wrote:
> (2010/09/02 12:38), Robert Haas wrote:
>> 2010/9/1 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>> (2010/09/02 11:57), Robert Haas wrote:
>>>> 2010/9/1 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>>>> Right now, it stands on a strict assumption that considers operators
>>>>> implemented with built-in functions are safe; it does not have no
>>>>> possibility to leak supplied arguments anywhere.
>>>>>
>>>>> Please note that this patch does not case about a case when
>>>>> a function inside a view and a function outside a view are
>>>>> distributed into same level and the later function has lower
>>>>> cost value.
>>>>
>>>> Without making some attempt to address these two points, I don't see
>>>> the point of this patch.
>>>>
>>>> Also, I believe we decided previously do this deoptimization only in
>>>> case the user requests it with CREATE SECURITY VIEW.
>>>>
>>> Perhaps, I remember the previous discussion incorrectly.
>>>
>>> If we have a hint about whether the supplied view is intended to security
>>> purpose, or not, it seems to me it is a reliable method to prevent pulling
>>> up the subqueries come from security views.
>>> Is it too much deoptimization?
>>
>> Well, that'd prevent something like id = 3 from getting pushed down,
>> which seems a bit harsh.
>>

I've tried to implement a proof of the concept patch according to the
following logic.
(For the quick hack, it does not include statement support. It just
considers view with name begun from "s" as security views.)

> Hmm. If so, we need to remember what FromExpr was come from subqueries
> of security views, and what were not. Then, we need to prevent leakable
> clause will be distributed to inside of them.
> In addition, we also need to care about the order of function calls in
> same level, because it is not implicitly solved.
>
> At first, let's consider top-half of the matter.
>
> When views are expanded into subqueries in query-rewriter, Query tree
> lost an information OID of the view, because RangeTblEntry does not have
> OID of the relation when it is RTE_SUBQUERY. So, we need to patch here
> to mark a flag whether the supplied view is security focused, or not.
>
This patch added 'security_view' flag into RangeTblEntry.
It shall be set when the query rewriter expands security views.

> Then, pull_up_simple_subquery() pulls up a supplied subquery into normal
> join, if possible. In this case, FromExpr is chained into the upper level.
> Of course, FromExpr does not have a flag to show its origin, so we also
> need to copy the new flag in RangeTblEntry to FromExpr.
>
This patch also added 'security_view' flag into FromExpr.
It shall be set when the pull_up_simple_subquery() pulled up
a RangeTblEntry with security_view = true.

> Then, when distribute_qual_to_rels() is called, the caller also provides
> a Bitmapset of relation-Ids which are contained under the FromExpr with
> the flag saying it came from the security views.
> Even if the supplied clause references a part of the Bitmapset, we need
> to prevent the clause being pushed down into the relations came from
> security views, except for ones we can make sure these are safe.
>
Just before distribute_qual_to_rels(), deconstruct_recurse() is invoked.
It walks on the supplied join-tree to collect what relations are appeared
under the current FromExpr/JoinExpr.
The deconstruct_recurse() was modified to take two new arguments of
'bool below_sec_barriers' and 'Relids *sec_barriers'.
The first one means the current recursion is under the FromExpr with
security_view being true. At that time, it set appeared relations on
the sec_barriers, then returns.
In the result, the 'sec_barriers' shall become a bitmapset of relations
being under the FromExpr which is originated by security views.

Then, 'sec_barriers' shall be delivered to distribute_qual_to_rels().
If the supplied qualifier references a part of 'sec_barriers' and
contains possibly leakable functions, it appends whole of the sec_barriers
to the bitmapset of relations on which the clause is depending.
In the result, it shall not be pushed down into the security view.

Example)
testdb=# CREATE VIEW n_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW
testdb=# CREATE VIEW s_view AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
CREATE VIEW

testdb=# EXPLAIN SELECT * FROM n_view WHERE f_malicious(y);
QUERY PLAN
-------------------------------------------------------------------
Hash Join (cost=334.93..365.94 rows=410 width=72)
Hash Cond: (t1.a = t2.x)
-> Seq Scan on t1 (cost=0.00..22.30 rows=1230 width=36)
-> Hash (cost=329.80..329.80 rows=410 width=36)
-> Seq Scan on t2 (cost=0.00..329.80 rows=410 width=36)
Filter: f_malicious(y)
(6 rows)

testdb=# EXPLAIN SELECT * FROM s_view WHERE f_malicious(y);
QUERY PLAN
-------------------------------------------------------------------
Hash Join (cost=37.68..384.39 rows=410 width=72)
Hash Cond: (t1.a = t2.x)
Join Filter: f_malicious(t2.y)
-> Seq Scan on t1 (cost=0.00..22.30 rows=1230 width=36)
-> Hash (cost=22.30..22.30 rows=1230 width=36)
-> Seq Scan on t2 (cost=0.00..22.30 rows=1230 width=36)
(6 rows)

==> f_malicious() was moved to outside of the join loop

testdb=# EXPLAIN SELECT * FROM n_view WHERE f_malicious(y) AND a = 2;
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop (cost=0.00..16.80 rows=1 width=72)
-> Index Scan using t1_pkey on t1 (cost=0.00..8.27 rows=1 width=36)
Index Cond: (a = 2)
-> Index Scan using t2_pkey on t2 (cost=0.00..8.52 rows=1 width=36)
Index Cond: (x = 2)
Filter: f_malicious(y)
(6 rows)

testdb=# EXPLAIN SELECT * FROM s_view WHERE f_malicious(y) AND a = 2;
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop (cost=0.00..16.80 rows=1 width=72)
Join Filter: f_malicious(t2.y)
-> Index Scan using t1_pkey on t1 (cost=0.00..8.27 rows=1 width=36)
Index Cond: (a = 2)
-> Index Scan using t2_pkey on t2 (cost=0.00..8.27 rows=1 width=36)
Index Cond: (x = 2)
(6 rows)
==> Because 'a = 2' is not leakable, this clause was pushed down into
the join loop. But f_malicious() is not in 's_view' example.

testdb=# CREATE VIEW n_view2 AS SELECT * FROM t1 WHERE f_policy(a);
CREATE VIEW
testdb=# CREATE VIEW s_view2 AS SELECT * FROM t1 WHERE f_policy(a);
CREATE VIEW
testdb=# EXPLAIN SELECT * FROM n_view2 WHERE f_malicious(b);
QUERY PLAN
-------------------------------------------------------
Seq Scan on t1 (cost=0.00..329.80 rows=137 width=36)
Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

testdb=# EXPLAIN SELECT * FROM s_view2 WHERE f_malicious(b);
QUERY PLAN
-------------------------------------------------------
Seq Scan on t1 (cost=0.00..329.80 rows=137 width=36)
Filter: (f_malicious(b) AND f_policy(a))
(2 rows)

==> But it does not affect anything on the case when a leakable
function from outside of the view is chained to same level
with security policy function of the view.
In this case, we need to mark functions the original nest
level, then sort by the nest level rather than cost on the
order_qual_clauses().

Thanks,
--
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-fix-leaky-join-view.5.patch text/x-patch 23.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2010-09-02 08:01:09 Re: git: uh-oh
Previous Message Jeff Davis 2010-09-02 05:48:37 Re: compiling with RELCACHE_FORCE_RELEASE doesn't pass regression