Re: INSERT ... ON CONFLICT UPDATE and RLS

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 20:26:07
Message-ID: 20150109202607.GE3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dean, Peter,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 9 January 2015 at 08:49, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >> I was trying to think up an example where you might actually have
> >> different INSERT and UPDATE policies, and the best I can think of is
> >> some sort of mod_count column where you have an INSERT CHECK
> >> (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
> >> checking both policies would make an UPSERT impossible, whereas if you
> >> think of it as doing either an INSERT or an UPDATE, as the syntax
> >> suggests, it becomes possible.
> >
> > Why does this user want to do this upsert? If they're upserting, then
> > the inserted row could only reasonably have a value of (mod_count =
> > 0). If updating, then they must have a constant value for the update
> > path (a value that's greater than 0, naturally - say 2), which doesn't
> > make any sense in the context of an upsert's auxiliary update - what
> > happened to the 0 value? Sorry, but I don't think your example makes
> > sense - I can't see what would motivate anyone to write a query like
> > that with those RLS policies in place. It sounds like you're talking
> > about an insert and a separate update that may or may not affect the
> > same row, and not an upsert. Then those policies make sense, but in
> > practice they render the upsert you describe contradictory.
> >
>
> Whoa, hang on. I think you're being a bit quick to dismiss that
> example. Why shouldn't I want an upsert where the majority of the
> table columns follow the usual "make it so" pattern of an upsert, but
> there is also this kind of audit column to be maintained? Then I would
> write something like
>
> INSERT INTO tbl (<some values>, 0)
> ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;

That's a good point- it doesn't make sense to compare the INSERT values
against the UPDATE policy as the result of the UPDATE could be a
completely different tuple, one we can't know the contents of until we
actually find a conflicting tuple. We can't simply combine the policies
and run them at the same time, but I'm not sure that then means we
should let an INSERT .. ON CONFLICT UPDATE pass in values to the INSERT
which are not permitted by any INSERT policy on the relation.

Further, forgetting about RLS, what happens if a user doesn't have
INSERT rights on the mod_count column at all? We still allow the above
to execute if the row already exists? That doesn't strike me as a good
idea. In this case, I do think it makes sense to consider RLS and our
regular permissions system as they are both 'OR' cases. I might have
UPDATE rights for a particular column through multiple different roles
for a given relation, but if none of them give me INSERT rights for that
column, then I'd expect to get an error if I try to do an INSERT into
that column.

> The root of the problem is the way that you're proposing to combine
> the RLS policies (using AND), which runs contrary to the way RLS
> policies are usually combined (using OR), which is why this kind of
> example fails -- RLS policies in general aren't intended to all be
> true simultaneously.

I agree that RLS policies, in general, are not intended to all be true
simultaneously. I don't think it then follows that an INSERT .. ON
CONFLICT UPDATE should be allowed if, for example, there are no INSERT
policies on an RLS-enabled relation, because the call happens to end up
doing an UPDATE.

To try and outline another possible use-case, consider that you have
tuples coming in from two independent sets of users, orange and blue.
All users are allowed to INSERT, but the 'color' column must equal the
user's. For UPDATEs, the blue users can see both orange and blue
records while orange users can only see orange records. Rows resulting
from an UPDATE can be orange or blue for blue users, but must be orange
for orange users.

Further, in this environment, attempts by orange users to insert blue
records are flagged to be audited for review because orange users should
never have access to nor be handling blue data. With the approach
you're outlining, a command like:

INSERT INTO data VALUES (... , 'blue')
ON CONFLICT UPDATE set ... = ...;

run by an orange user wouldn't error if it happened to result in an
UPDATE (presuming, of course, that the UPDATE didn't try to change the
existing color), but from a security perspective, it's clearly an error
for an orange user to be attempting to insert blue data.

I don't see this as impacting the more general case of how RLS policies
are applied. An individual blue user might also be a member of some
orange-related role and it wouldn't make any sense for that user to then
be only able to see orange records (were we to AND RLS policies
together).

Where this leaves me, at least, is feeling like we should always apply
the INSERT WITH CHECK policy, then if there is a conflict, check the
UPDATE USING policy and throw an error if the row isn't visible but
otherwise perform the UPDATE and then check the UPDATE WITH CHECK
policy. I see your point that this runs counter to the 'mod_count'
example use-case and could cause problems for users using RLS with such
a strategy. For my part, I expect users of RLS to expect errors in such
a case rather than allowing it, but it's certainly a judgement call.

The only reasonable way that I can see to support both sides would be to
allow UPSERT to be a top-level policy definition in its own right and
let users specify exactly what is allowed in the UPSERT case (possibly
requiring three different expressions to represent the initial INSERT,
what the UPDATE can see, and what the UPDATE results in..). I tend to
doubt it would be worth it unless we end up supporting UPSERT-specific
triggers and permissions..

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-01-09 20:43:57 Re: INSERT ... ON CONFLICT UPDATE and RLS
Previous Message Bruce Momjian 2015-01-09 20:04:19 Re: PQputCopyEnd doesn't adhere to its API contract