Re: Improving RLS planning

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving RLS planning
Date: 2016-11-28 23:55:22
Message-ID: 20161128235522.GH13284@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> >> +1 for that terminology and no renaming of fields.
>
> > Agreed.
>
> Here's an updated version of the RLS planning patch that gets rid of
> the incorrect interaction with Query.hasRowSecurity and adjusts
> terminology as agreed.

I've spent a fair bit of time going over this change to understand it,
how it works, and how it changes the way RLS and securiy barrier views
work.

Overall, I'm happy with how it works and don't see any serious issues
with the qual ordering or the general concept. I did have a few
comments from my review:

> diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
> [...]
> + Additional rules will be needed to support safe handling of join quals
> + when there is a mix of security levels among join quals; for example, it
> + will be necessary to prevent leaky higher-security-level quals from being
> + evaluated at a lower join level than other quals of lower security level.
> + Currently there is no need to consider that since security-prioritized
> + quals can only be single-table restriction quals coming from RLS policies
> + or security-barrier views, and thus enforcement only needs to happen at
> + the table scan level. With such extra rules, it should be possible to let
> + security-barrier views be flattened into the parent query, allowing more
> + flexibility of planning while still preserving required ordering of qual
> + evaluation. But that will come later.

Are you thinking that we will be able to remove the cut-out in
is_simple_subquery() which currently punts whenever an RTE is marked as
security_barrier? That would certainly be excellent as, currently, even
if everything involved is leakproof, we may end up with a poor choice of
plan because the join in the security barrier view must be performed
first. Consider a case where we have two relativly large tables being
joined together in a security barrier view, but a join from outside
which is against a small table. A better plan would generally be to
join with the smaller table first and then join the resulting small set
against the remaining large table.

Speaking of which, it seems like we should probably update the README to
include some mention, at least, of what we're doing today when it comes
to joins which involve security barrier entanglements.

> diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
[...]
> ! /*
> ! * In addition to the quals inherited from the parent, we might have
> ! * securityQuals associated with this particular child node. (This
> ! * won't happen in inheritance cases, only with appendrels originating
> ! * from UNION ALL.) Pull them up into the baserestrictinfo for the
> ! * child. This is similar to process_security_barrier_quals() for the
> ! * parent rel, except that we can't make any general deductions from
> ! * such quals, since they don't hold for the whole appendrel.
> ! */

Right, this won't happen in inheritance cases because we explicitly
don't consider the quals of the children when querying through the
parent, similar to how we don't consider the GRANT-based permissions on
the child tables. This is mentioned elsewhere but might make sense to
also mention it here, or at least say 'see expand_inherited_rtentry()'.

> *************** subquery_push_qual(Query *subquery, Rang
> *** 2708,2714 ****
> make_and_qual(subquery->jointree->quals, qual);
>
> /*
> ! * We need not change the subquery's hasAggs or hasSublinks flags,
> * since we can't be pushing down any aggregates that weren't there
> * before, and we don't push down subselects at all.
> */
> --- 2748,2754 ----
> make_and_qual(subquery->jointree->quals, qual);
>
> /*
> ! * We need not change the subquery's hasAggs or hasSubLinks flags,
> * since we can't be pushing down any aggregates that weren't there
> * before, and we don't push down subselects at all.
> */

Seems like this change is unrelated to what this patch is about. Not a
big deal, but did take me a second to realize that you were just
changing the case of the 'L' in hasSubLinks.

> + * We also reject proposed equivalence clauses if they contain leaky functions
> + * and have security_level above zero. The EC evaluation rules require us to
> + * apply certain tests at certain joining levels, and we can't tolerate
> + * delaying any test on security_level grounds. By rejecting candidate clauses
> + * that might require security delays, we ensure it's safe to apply an EC
> + * clause as soon as it's supposed to be applied.
[...]
> + /* Reject if it is potentially postponable by security considerations */
> + if (restrictinfo->security_level > 0 && !restrictinfo->leakproof)
> + return false;

The first comment makes a lot of sense, but the one-liner doesn't seem
as clear, to me anyway.

The result of the above, as I understand it, is that security_level will
either be zero, or the restrictinfo will be leakproof, no? Meaning that
ec_max_security will either be zero, or the functions involved will be
leakproof, right?

> *************** select_equality_operator(EquivalenceClas
[...]
> --- 1352,1364 ----
>
> opno = get_opfamily_member(opfamily, lefttype, righttype,
> BTEqualStrategyNumber);
> ! if (!OidIsValid(opno))
> ! continue;
> ! /* If no barrier quals in query, don't worry about leaky operators */
> ! if (ec->ec_max_security == 0)
> ! return opno;
> ! /* Otherwise, insist that selected operators be leakproof */
> ! if (get_func_leakproof(get_opcode(opno)))
> return opno;
> }
> return InvalidOid;

Leading me to wonder if the above ever actually falls through to the
InvalidOid case due to ec_max_security > 0 and the operator not being
leakproof. Reviewing the coverage-html output, it looks like the only
cases where InvalidOid is returned is when no operator can be found (and
that only happens 20 times throughout the regression tests, and only
through two of the many code paths that call this function).

Perhaps it's more difficult than it's worth to come up with cases that
cover the other code paths involved, but it seems like it might be good
to at least try to as it's likely to happen in more cases now that we're
returning (or should be, at least) InvalidOid due to the only operators
found being leaky ones.

> diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
[...]
> + /*
> + * If a clause is leakproof, it doesn't have to be constrained by
> + * its nominal security level. If it's also reasonably cheap
> + * (here defined as 10X cpu_operator_cost), pretend it has
> + * security_level 0, which will allow it to go in front of
> + * more-expensive quals of lower security levels. Of course, that
> + * will also force it to go in front of cheaper quals of its own
> + * security level, which is not so great, but we can alleviate
> + * that risk by applying the cost limit cutoff.
> + */
> + if (rinfo->leakproof && items[i].cost < 10 * cpu_operator_cost)
> + items[i].security_level = 0;
> + else
> + items[i].security_level = rinfo->security_level;
> + }
> + else
> + items[i].security_level = 0;
> i++;
> }

As discussed previously, this looks like a good, practical, hack, but I
feel a little bad that we don't mention it anywhere except in this
comment. Is it too low-level to get a mention in the README?

> diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
[...]
> --- 2104,2114 ----
>
> EXPLAIN (VERBOSE, COSTS OFF)
> UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
> ! QUERY PLAN
> ! --------------------------
> ! Result
> ! One-Time Filter: false
> ! (2 rows)

Perhaps Dean recalls something more specific, but reviewing this test
and the others around it, I believe it was simply to see what happens in
a case which doesn't pass the security barrier view constraint and to
cover the same cases with the UPDATE as were in the SELECTs above it. I
don't see it being an issue that it now results in a one-time filter:
false result.

Reviewing the other regression test changes, they all look good to me.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-11-29 00:54:10 Re: Fix comment in build_simple_rel
Previous Message Nico Williams 2016-11-28 23:18:19 Re: matview incremental maintenance