The attached patches are cut-off version based on the latest Robert's
updates. The "v8.regtest" adds regression test cases on variable
leaky-view scenarios with/without security-barrier property.
The "v8.option-1" add checks around restriction_selectivity, and prevent
to invoke estimator function if Var node touches relation with security-
The "v8.option-2" add checks around examine_simple_variable, and
prevent to reference statistical data, if Var node tries to reference
relation with security-barrier attribute.
(And, it shall be marked as "leakproof")
I initially thought restriction_selectivity called by clause_selectivity is
the best point to add checks, however, I reconsidered it might not be
the origin of this problem.
As long as user-defined functions acquires control on selectivity
estimation of operators, same problems can be re-produced;
if someone tries to reference unrelated data within estimator.
This scenario is normally prevented, because only superuser can define
a function that can bypass permission checks to reference internal data
structures; using "untrusted" procedural-language.
If my conclusion is right, what we should fix up is built-in estimators side,
and we should enforce estimator function being "leakproof", even though
we still allow unprivileged users to define operators.
2011/12/11 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2011/12/10 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> 2011/12/9 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> I feel like there must be some logic in the planner somewhere that is
>>> "looking through" the subquery RTE and figuring out that safe_foo.a is
>>> really the same variable as foo.a, and which therefore feels entitled
>>> to apply !!'s selectivity estimator to foo.a's statistics. If that's
>>> the case, it might be possible to handicap that logic so that when the
>>> security_barrier flag is set, it doesn't do that, and instead treats
>>> safe_foo.a as a black box. That would, obviously, degrade the quality
>>> of complex plans involving security views, but I think we should focus
>>> on getting something that meets our security goals first and then try
>>> to improve performance later.
>> I tried to investigate the code around size-estimation, and it seems to
>> me here is two candidates to put this type of checks.
>> The one is examine_simple_variable() that is used to pull a datum
>> from statistic information, but it locates on the code path restriction
>> estimator of operators; so user controlable, although it requires least
>> code changes just after if (rte->rtekind == RTE_SUBQUERY).
>> The other is clause_selectivity(). Its code path is not user controlable,
>> so we can apply necessary checks to prevent qualifier that reference
>> variable come from sub-query with security-barrier.
>> In my sense, clause_selectivity() is better place to apply this type of
>> checks. But, on the other hand, it provides get_relation_stats_hook
>> to allow extensions to control references to statistic data.
>> So, I wonder whether the PG core assumes this routine covers
>> all the code path here?
> The attached patch adds checks around invocation of selectivity
> estimator functions, and it changes behavior of the estimator, if the
> supplied operator tries to touch variables come from security-barrier
> Then, it fixes the problem you mentioned.
> postgres=# explain select * from safe_foo where a !! 0;
> QUERY PLAN
> Subquery Scan on safe_foo (cost=0.00..2.70 rows=3 width=4)
> Filter: (safe_foo.a !! 0)
> -> Seq Scan on foo (cost=0.00..1.14 rows=6 width=4)
> Filter: (a > 5)
> (4 rows)
> However, I'm still a bit skeptical on my patch, because it still allows
> to invoke estimator function when operator's argument does not
> touch values originated from security-barrier relation.
> In the case when oprrest or oprjoin are implemented without our
> regular convention (E.g, it anyway reference whole of statistical data),
> it will break this solution.
> Of course, it is an assumption that we cannot prevent any attack
> using binary modules, so we need to say "use it your own risk" if
> people tries to use extensional modules. And, we also need to
> keep the built-in code secure.
> Some of built-in estimator functions (such as eqsel) provides
> a feature that invokes operator function with arguments originated
> from pg_statistics table. It didn't harmless, however, we got understand
> that this logic can be used to break row-level security.
> So, I begin to consider the routines to be revised are some of built-in
> functions to be used for estimator functions; such as eqsel and ...
> These function eventually reference statistical data at examine_variable.
> It might be a better approach to add checks whether invocation of
> the supplied operator possibly leaks contents to be invisible.
> It seems to me the Idea of the attached patch depends on something
> internal stuff of existing built-in estimator functions...
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
In response to
pgsql-hackers by date
|Next:||From: Dimitri Fontaine||Date: 2011-12-12 17:12:25|
|Subject: Re: Command Triggers|
|Previous:||From: Andres Freund||Date: 2011-12-12 16:49:59|
|Subject: Re: Command Triggers|