Re: [RFC] Interface of Row Level Security

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Interface of Row Level Security
Date: 2012-05-31 20:55:27
Message-ID: CADyhKSX36tpJy86jC1QKZLN085BWr7JQoaNh8-OvdLnaMfC2qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/5/31 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/5/31 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> If we would have an "ideal optimizer", I'd still like the optimizer to
>>> wipe out redundant clauses transparently, rather than RLSBYPASS
>>> permissions, because it just controls all-or-nothing stuff.
>>> For example, if tuples are categorized to unclassified, classified or
>>> secret, and RLS policy is configured as:
>>>  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
>>> 'classified')) OR (X IN 'unclassified)),
>>> superuser can see all the tuples, and alice and bob can see
>>> up to classified tuples.
>>> Is it really hard to wipe out redundant condition at planner stage?
>>> If current_user is obviously 'kaigai', it seems to me the left-side of
>>> this clause can be wiped out at the planner stage.
>>> Do I consider the issue too simple?
>>
>> Yes.  :-)
>>
>> There are two problems.  First, if using the extended query protocol
>> (e.g. pgbench -M prepared) you can prepare a statement just once and
>> then execute it multiple times.  In this case, stable-functions cannot
>> be constant-folded at plan time, because they are only guaranteed to
>> remain constant for a *single* execution of the query, not for all
>> executions of the query.  So any optimization in this area would have
>> to be limited to cases where the simple query protocol is used.  I
>> think that might still be worth doing, but it's a significant
>> limitation, to be sure.  Second, at present, there is no guarantee
>> that the snapshot used for planning the query is the same as the
>> snapshot used for executing the query, though commit
>> d573e239f03506920938bf0be56c868d9c3416da made that happen in some
>> common cases.  If we were to do constant-folding of stable functions
>> using the planner snapshot, it would represent a behavior change from
>> previous releases.  I am not clear whether that has any real-world
>> consequences that we should be worried about.  It seems to me that the
>> path of least resistance might be to refactor the portal stuff so that
>> we can provide a uniform guarantee that, when using the simple query
>> protocol, the planner and executor snapshots will be the same ... but
>> I might be wrong.
>>
> It may be an option to separate the case into two; a situation to execute
> the given query immediately just after optimization and never reused,
> and others.
> Even though the second situation, it may give us better query execution
> plan, if we try to reconstruct query plan just before executor with
> assumption that expects immutable / stable function can be replaced
> by constant value prior to execution.
> In other words, this idea tries to query optimization again on EXECUTE
> statement against to its nature, to replace immutable / stable functions
> by constant value, and to generate wiser execute plan.
> At least, it may make sense to have a flag on prepared statement to
> indicate whether it has possible better plan with this re-construction.
>
> Then, if so, we will be able to push the stuff corresponding to
> RLSBYPASS into the query optimization, and works transparently
> for users.
>
> Isn't it feasible to implement?
>
If we could replace a particular term that consists of constant values
and stable / immutable functions only by parameter references,
it may enable to handle the term as if a constant value, but actual
calculation is delayed to executor stage.

For example, according to this idea,
PREPARE p1(int) AS SELECT * FROM tbl WHERE
current_user in ('alice','bob') AND X > $1;
shall be internally rewritten to,
PREPARE p1(int) AS SELECT * FROM tbl WHERE
$2 AND X>$1;

then, $2 is implicitly calculated just before execution of this prepared
statement. The snapshot to be used for this calculation is same with
executor's one. It seems to me it is a feasible idea with less invasive
implementation to existing planner.

Does it make sense to describe exceptional condition using regular
clause, instead of special permission?

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bryan Murphy 2012-05-31 20:55:45 pg_upgrade from 9.0.7 to 9.1.3: duplicate key pg_authid_oid_index
Previous Message Kohei KaiGai 2012-05-31 19:52:38 Re: [RFC] Interface of Row Level Security