Re: Improving RLS planning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving RLS planning
Date: 2016-11-03 22:23:18
Message-ID: 30304.1478211798@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On 25 October 2016 at 22:58, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The alternative I'm now thinking about pursuing is to get rid of the
>> conversion of RLS quals to subqueries. Instead, we can label individual
>> qual clauses with security precedence markings.

> +1 for this approach. It looks like it could potentially be much
> simpler. There's some ugly code in the inheritance planner (and
> probably one or two other places) that it might be possible to chop
> out, which would probably also speed up planning times.

Here's a draft patch for this. I've only addressed the RLS use-case
so far, so this doesn't get into managing the order of application of
join quals, only restriction quals.

>> 2. In order_qual_clauses, sort first by security_level and second by cost.

> I think that ordering might be sub-optimal if you had a mix of
> leakproof quals and security quals and the cost of some security quals
> were significantly higher than the cost of some other quals. Perhaps
> all leakproof quals should be assigned security_level 0, to allow them
> to be checked earlier if they have a lower cost (whether or not they
> are security quals), and only leaky quals would have a security_level
> greater than zero. Rule 1 would then not need to check whether the
> qual was leakproof, and you probably wouldn't need the separate
> "leakproof" bool field on RestrictInfo.

Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.

I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity. For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.

So what I ended up doing was using your idea of forcing the security
level to 0 for leakproof quals, but only if they have cost below a
threshold (which I set at 10X cpu_operator_cost, which should take in
most built-in functions). That at least limits the possible damage
from forcing early evaluation of a leakproof qual. There may be
some better way to do it, though, so I didn't go so far as to remove
the separate leakproof flag.

Some other notes:

* This creates a requirement on scan-planning code (and someday on
join-planning code) to be sure it doesn't create violations of the qual
ordering rule. Currently only indxpath.c and tidpath.c have to worry
about that AFAICS. FDWs would need to worry about it too, except that
we don't currently allow RLS to be enabled on foreign tables. I'm a
little concerned about whether FDWs could create security holes by
not accounting for this, but it's moot for now. Custom scan providers
will need to pay attention as well.

* prepsecurity.c is now dead code and should be removed, but I did not
include that in this patch, since it would just bloat the patch.

* Accounting for the removal of prepsecurity.c, this is actually a net
savings of about 300 lines of code. So I feel pretty good about that.
It also gets rid of some really messy kluges, particularly the behavior
of generating new subquery RTEs as late as halfway through
grouping_planner. I find it astonishing that that worked at all.

* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it. I have not looked to see if these represent
live bugs in existing releases, but they might. Or am I misunderstanding
what the flag is supposed to mean?

* Aside from plan changes, there's one actual behavioral change in the
regression test results, but I think it's okay because it's a question
of whether to put a non-RLS qual before or after a leaky qual. That's
not something we are promising anything about.

* There's one test query in updatable_views.sql where the plan collapses
to a dummy (Result with constant false qual) because the planner is now
able to see that the qual conditions are mutually contradictory. Maybe
that query needs adjustment; I'm not sure what it's intending to test
exactly.

regards, tom lane

PS: I've been slacking on the commitfest because I wanted to push this
to a reasonably finished state before I set it aside to do CF work.
I'm not expecting it to be reviewed in this fest.

Attachment Content-Type Size
new-rls-planning-implementation-1.patch.gz application/x-gzip 28.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-11-03 23:16:18 Re: Row level security implementation in Foreign Table in Postgres
Previous Message Clifford Hammerschmidt 2016-11-03 22:04:17 C based plugins, clocks, locks, and configuration variables