Row Level Security − leakproof-ness and performance implications

From: Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Row Level Security − leakproof-ness and performance implications
Date: 2019-02-19 18:37:22
Message-ID: 2811772.0XtDgEdalL@peanuts2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

In order to increase our security, we have started deploying row-level
security in order to add another safety net if any issue was to happen in our
applications.
After a careful monitoring of our databases, we found out that a lot of
queries started to go south, going extremely slow.
The root of these slowdowns is that a lot of the PostgreSQL functions are not
marked as leakproof, especially the ones used for operators.
In current git master, the following query returns 258 functions that are used
by operators returning booleans and not marked leakproof:

SELECT proname FROM pg_proc
WHERE exists(select 1 from pg_operator where oprcode::name = proname)
AND NOT(proleakproof)
AND prorettype = 16;

Among these, if you have for instance a table like:
create table demo_enum (
username text,
flag my_enum
);

With an index on (username, flag), the index will only be used on username
because flag is an enum and the enum_eq function is not marked leakproof.

For simple functions like enum_eq/enum_ne, marking them leakproof is an
obvious fix (patch attached to this email, including also textin/textout). And
if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
in functions leakproof would simply be to prevent them from logging errors
containing values.

For others, like arraycontains, it's much trickier : arraycontains can be
leakproof only if the comparison function of the inner type is leakproof too.
This raises some interesting issues, like arraycontains being marked as strict
parallel safe while there is nothing making it mandatory for the inner type.
In order to optimize queries on a table like:
create table demo_array (username text, my_array int[]);
one would need to mark arraycontains leakproof, thus implying that any
comparison operator must be leakproof too? Another possibility, not that
complicated, would be to create specific-types entries in pg_proc for each
type that has a leakproof comparison operator. Or the complicated path would
be to have a 'dynamic' leakproof for functions like arraycontains that depend
on the inner type, but since this is determined at planning, it seems very
complicated to implement.

A third issue we noticed is the usage of functional indexes. If you create an
index on, for instance, (username, leaky_function(my_field)), and then search
on leaky_functon(my_field) = 42, the functional index can be used only if
leaky_function is marked leakproof, even if it is never going to be executed
on invalid rows thanks to the index. After a quick look at the source code, it
also seems complicated to implement since the decision to reject potential
dangerous calls to leaky_function is done really early in the process, before
the optimizer starts.

I am willing to spend some time on these issues, but I have no specific
knowledge of the planner and optimizer, and I fear proper fixes would require
much digging there. If you think it would be appropriate for functions like
arraycontains or range_contains to require the 'inner' comparison function to
be leakproof, or just keep looking at the other functions in pg_proc and
leakproof the ones that can be, I would be happy to write the corresponding
patches.

Best regards

Attachment Content-Type Size
0001-Mark-textin-out-and-enum_eq-ne-as-leakproof.patch text/x-patch 1.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-02-19 18:43:14 Re: WAL insert delay settings
Previous Message Andres Freund 2019-02-19 18:35:25 Re: WAL insert delay settings