Improving RLS planning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Improving RLS planning
Date: 2016-10-25 21:58:21
Message-ID: 8185.1477432701@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Currently, we don't produce very good plans when row-level security
is enabled. An example is that, given

create table t1 (pk1 int primary key, label text);
create table t2 (pk2 int primary key, fk int references t1);

then for

select * from t1, t2 where pk1 = fk and pk2 = 42;

you would ordinarily get a cheap plan like

Nested Loop
-> Index Scan using t2_pkey on t2
Index Cond: (pk2 = 42)
-> Index Scan using t1_pkey on t1
Index Cond: (pk1 = t2.fk)

But stick an RLS policy on t1, and that degrades to a seqscan, eg

Nested Loop
Join Filter: (t1.pk1 = t2.fk)
-> Index Scan using t2_pkey on t2
Index Cond: (pk2 = 42)
-> Seq Scan on t1
Filter: (label = 'public'::text)

The reason for this is that we implement RLS by turning the reference
to t1 into a sub-SELECT, and the planner's recursive invocation of
subquery_planner produces only a seqscan path for t1, there not being
any reason visible in the subquery for it to do differently.

I have been thinking about improving this by allowing subquery_planner
to generate parameterized paths; but the more I think about that the
less satisfied I am with it. It will be quite expensive and probably
will still fail to find desirable plans in many cases. (I've not given
up on parameterized subquery paths altogether --- I just feel it'd be a
brute-force and not very effective way of dealing with RLS.)

The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings. Concretely, suppose we
add an "int security_level" field to struct RestrictInfo. The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof. (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals. But that's just an optimization.)

In the initial implementation, quals coming from a RangeTblEntry's
securityQuals field would have security_level 0, quals coming from
anywhere else would have security_level 1; except that if we know
there are no security quals anywhere (ie not Query->hasRowSecurity),
we could give all quals security_level 0. (I think this exception
may be worth making because there's no need to test leakproofness
for a qual with security level 0; it could never be a candidate
for security delay anyway.)

Having done that much, I think all we need in order to get rid of
RLS subqueries, and just stick RLS quals into their relation's
baserestrictinfo list, are two rules:

1. When selecting potential indexquals, a RestrictInfo can be considered
for indexqual use only if it is leakproof or has security_level <= the
minimum among the table's baserestrictinfo clauses.

2. In order_qual_clauses, sort first by security_level and second by cost.

This would already be enough of a win to be worth doing. I think though
that this mechanism can be extended to also allow getting rid of the
restriction that security-barrier views can't be flattened. The idea
would be to make sure that quals coming from above the SB view are given
higher security_level values than quals within the SB view. We'd need
some extra mechanism to make that possible --- perhaps an additional kind
of node within jointree nests to show where there had been a
security-barrier boundary, and then some smarts in distribute_qual_to_rels
to prevent pushing upper quals down past a lower qual of strictly lesser
security level. But that can come later. (We do not need such smarts
to fix the RLS problem, because in the initial version, quals with lower
security level than another qual could only exist at the baserel level.)

In short, I'm proposing to throw away the entire existing implementation
for planning of RLS and SB views, and start over.

There are some corner cases I've not entirely worked out, in particular
what security_level to assign to quals generated from EquivalenceClasses.
A safe but not optimal answer would be to assign them the maximum
security_level of any source clause of the EC. Maybe it's not worth
working harder than that, because most equality operators are leakproof
anyway, so that it wouldn't matter what level we assigned them.

Before I start implementing this, can anyone see a fatal flaw in the
design?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2016-10-25 22:27:51 Re: Does it make sense to add a -q (quiet) flag to initdb?
Previous Message Michael Paquier 2016-10-25 21:56:13 Re: pg_basebackup stream xlog to tar