Re: A mechanism securing web applications in DBMS

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Zhaomo Yang <zhy001(at)cs(dot)ucsd(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, craig(at)2ndquadrant(dot)com, kaigai(at)ak(dot)jp(dot)nec(dot)com, Kirill Levchenko <klevchen(at)cs(dot)ucsd(dot)edu>
Subject: Re: A mechanism securing web applications in DBMS
Date: 2014-09-05 01:53:05
Message-ID: 20140905015305.GL16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Zhaomo,

* Zhaomo Yang (zhy001(at)cs(dot)ucsd(dot)edu) wrote:
> I am a graduate student from UC San Diego. My adviser, Dr. Kirill
> Levchenko, and I have been working on a web/DB security project for
> the last few months. Since fine-grained access control in DBMS is part
> of our project and the PostgreSQL community is also working on it now,
> we would like to exchange some ideas on the issue with you.

Fantastic! Very interested.

> 1. Background
> Nowadays people entrust more and more sensitive data to web
> applications, while security vulnerabilities are common in these
> applications. The normal structure of web applications consists of
> three tiers: client-side code, server-side code and a database. In
> server-side code a database user representing all the
> application-level users interacts with the database with full
> privileges. Since currently database built-in access control is too
> coarse, no access control mechanism is used in most of web
> applications. This situation is not ideal since a malicious
> application user can tamper with other users’ data by exploiting
> vulnerabilities in the application, and if the application is
> completely under the malicious user’s control so is the database.

Agreed- we are certainly working to improve that situation, though
consideration must also be given to how our catalogs are structured and
that they are unlikely to support "web-scale" numbers of individual
roles as currently implemented. We're in a better spot than we were a
few years ago (we used to have flat files for roles...), but we lack any
partitioning capability in the catalogs today.

It's entirely possible that, in some situations, the inability of PG to
support the number of roles has dwarfed the permission granularity
concern. Even today there are mechanisms available to limit what an
individual role can view, add, update, or delete (though they are less
than ideal, rather cludgy to work with and could be better).

> 2. Our Mechanism.
> In order to fix the problem, we believe there are two underlying
> problems to be addressed: authenticating application-level users to
> DBMS and fine-grained access control in the DBMS. Note that these two
> problems are related. Without authenticating application-level users
> to the DBMS, can the DBMS not know which application-level user it is
> actually interacting with, thus the DBMS cannot apply fine-grained
> access control policy correctly.

Agreed- PG needs to know the user to be able to get to the level of
returning only what that individual user is allowed to see.

> 2.1 The authentication problem
> We address the authentication problem by requiring developers to
> define an authentication function in the DBMS. This function is
> invoked whenever an application-level user logs in. An authentication
> function contains the authentication logic in the server-side code.
> Besides, it also caches some attributes of the current
> application-level user, which will be checked by the fine-grained
> access control mechanism.

This is one approach which can be used, though PG has a reasonably
complex authentication system which applications can leverage (consider
Kerberos proxying and client-side certificates), rather than having the
application independently develop an authentication system. Still, if
the application is to be responsible for the authentication, then a
simple 'SET ROLE' can be done in PG to switch to the context of an
individual user.

> In our implementation, we added a new type of function called
> authentication function. Developers are supposed to use CREATE
> AUTHENTICATION FUNCTION command to define authentication function,
> which contains the application authentication logic and must be
> defined to return table rows. When the authentication function is
> invoked, the return values are cached in a temp table, called the
> authentication table, with the same name as the authentication
> function. The authentication table is available to access control
> statements (GRANT-WHERE statements).

It'd be great if you could explain the actual difference, in PG, between
these AUTHENTICATION functions and regular functions (or perhaps
security definer functions), which can also create tables and return
rows.

> 2.2 The fined-grained access control problem
>
> We address the fined-grained access control problem by using
> GRANT-WHERE statements from [1]. Concretely, the syntax of GRANT-WHERE
> statements is
>
> GRANT <privilege> [, ...] ON <relname> TO <user> [, ...]
> USING <referenced relname> [, ...]
> WHERE <predicate>;
>
> where <privilege> can be SELECT, INSERT, UPDATE or DELETE, the USING
> clause lists the tables of which should be available to the predicate
> and the WHERE clause specifies the predicate.

Certainly an interesting alternative, though you run the very serious
risk that the SQL standard will come up with an extention to the GRANT
syntax that will end up causing problems long-term (this is part of the
reason for the CREATE POLICY syntax that we're currently working
towards).

> 3. Integrate our mechanism to PostgreSQL.
>
> 3.1 Authentication Function
>
> The major challenge of this part is how to cache users’ authentication
> information. In our prototype implementation, an authentication table
> is actually a temporary table. But it makes our GRANT-WHERE statements
> also temporary since they refer to the authentication table. Since
> every session has its own authentication table and all the
> authentication tables share the same schema, the global temporary
> table [2] would be a perfect fit for caching users’ authentication
> information. Also, it will not make GRANT-WHERE statements temporary
> anymore.

This certainly deserves further discussion. What is the reason that a
cacheing mechanism into a temporary table is necessary..? Are these
session identifiers to keep track of already-authenticated users? How
does that play with a connection pooler (which is a very real
consideration when thinking about anything web-scale).

> 3.2 Fined-Grained Access Control
>
> We noticed that the PostgreSQL community is currently working on
> fine-grained access control. Access control policies can be defined in
> form of “ALTER TABLE … SET ROW SECURITY” statement. Although
> functionally “ALTER TABLE … SET ROW SECURITY” statements and
> GRANT-WHERE statements are equivalent, however, there are some
> advantages in the latter one:
>
> 1) GRANT-WHERE statement is very similar to the SQL GRANT statement,
> which should be already familiar to most of the developers, thus the
> developers may feel more comfortable to use it.

To be frank, I don't really buy this argument. There are certainly
arguments to be had for why GRANT would make sense instead of ALTER
TABLE, but "developer comfort" isn't one of them, and the concern
regarding the SQL specification for the GRANT command trumps it.

> 2) In GRANT-WHERE syntax, a DB user can be specified as the recipient
> of the GRANT. In the current proposed row security syntax, one needs
> to write a big statement containing all the predicates for all the
> possible recipients per table per operation.

This isn't actually accurate- the latest CREATE POLICY syntax includes
the ability to specify roles to which the policy is to be applied.
Further, the next patch will include multiple expressions to address
what happens during UPDATE (where you have to consider what rows are
allowed to be selected, and what rows are allowed to be added back into
the table). That would certainly further complicate the already quite
complicated GRANT syntax, and not to much advantage that I can see.

> For example, suppose we have the following two GRANT-WHERE statements on table1

[...]

> As you can see, this syntax requires us to combine two logically
> separate statement into one statement. Also, it can be hard for
> developers to debug.

We've moved off of ALTER TABLE for this approach for a number of quite
reasonable reasons and have moved to an independent CREATE POLICY
approach, which addresses the above concerns.

That said, I do feel that in many situations the reality is that users
will simply create a stored procedure (in plpgsql or similar) which then
evaluates what is being requested against another table or possibly even
an external policy system (eg: SELinux) to determine if the record
should be returned. This runs the risk of not being as performant,
which is absolutely a concern we are considering, but supporting the
per-role policies (which are already in the latest patch) will hopefully
address some of that concern, and we may be able to do even better in
the future.

> 4. Collaboration.
>
> We are looking forward to joining forces with PostgreSQL community on
> this problem. More specifically, we have the following suggestions
> which we hope the community would consider:
>
> 1) Using GRANT-WHERE syntax for the row-level access control. If for
> any reasons it is impossible, we hope at least a clause will be added
> to the “ALTER TABLE … SET ROW SECURITY” syntax such that a recipient
> can be specified;

Right, already done and integrated into the row-security patch (though
the specific syntax is CREATE POLICY, as discussed previously). We
certainly welcome feedback on that patch and would love to discuss
further- feel free to contact me directly and we'll set up time to
discuss.

> 2) Restarting Global Temporary Table project;

I'm certainly for this.

> 3) Accepting “CREATE AUTHENTICATION FUNCTION” as a variant of “CREATE
> FUNCTION” command.

More information about how it's different from CREATE FUNCTION would be
very helpful for this.

> We’d like to discuss the mechanisms above with the community further,
> and if any of the mechanisms are accepted by the community, we also
> would like to take part in implementing it. Please feel free to
> contact us if you have any questions.

Consider it done- I'd like to collaborate with you on the row-security
discussion primairly, but also about how to manage very many users in
PG and the authentication challenges.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2014-09-05 02:04:40 Re: jsonb format is pessimal for toast compression
Previous Message Peter Geoghegan 2014-09-05 00:18:59 Re: B-Tree support function number 3 (strxfrm() optimization)