Re: [HACKERS] MERGE SQL Statement for PG11

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-16 05:39:07
Message-ID: CABOikdPxjUak5Q6gAh+p7vNAUx0tGX8ZCGChstYjfNoew2Gixg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Stephen,

On Fri, Mar 16, 2018 at 7:28 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings Pavan, all,
>
> * Pavan Deolasee (pavan(dot)deolasee(at)gmail(dot)com) wrote:
> > On 9 March 2018 at 08:29, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > > My #1 concern has become RLS, and
> > > perhaps only because I haven't studied it in enough detail.
> >
> > Sure. I've done what I thought is the right thing to do, but please
> check.
> > Stephen also wanted to review RLS related code; I don't know if he had
> > chance to do so.
>
> I've started looking through the code from an RLS perspective and, at
> least on my initial review, it looks alright.
>

Thanks for taking time out to review the patch. It certainly helps a lot.

>
> A couple test that aren't included that I think should be in the
> regression suire are where both tables have RLS policies and
> where the RLS tables have actual SELECT policies beyond just 'true'.
>
> I certainly see SELECT policies which limits the records that
> individuals are allowed to see very frequently and it's an
> important case to ensure works correctly. I did a few tests here myself
> and they behaved as I expected, and reading through the code it looks
> reasonable, but they should be simple to write tests which run very
> quickly.
>

Ok. I will add those tests.

>
> I'm a bit on the fence about it, but as MERGE is added in quite a few
> places which previously mentioned UPDATE and DELETE throughout the
> system, I wonder if we shouldn't do better than this:
>
> =*# create policy p1 on t1 for merge using ((c1 % 2) = 0);
> ERROR: syntax error at or near "merge"
> LINE 1: create policy p1 on t1 for merge using ((c1 % 2) = 0);
>
> Specifically, perhaps we should change that to pick up on MERGE being
> asked for there and return an error saying that policies for the MERGE
> command aren't supported with a HINT that MERGE respects
> INSERT/UPDATE/DELETE policies and that users should use those instead.
>

Hmm. I am not sure if that would be a good idea just for RLS. We might then
also want to change several other places in the grammar to accept
INSERT/UPDATE/DELETE keyword and handle that during parse analysis. We can
certainly do that, but I am not sure if it adds a lot of value and
certainly adds a lot more code. We should surely document these things, if
we are not already.

>
> Also, in nodeModifyTable.c, there's a comment:
>
> * The WITH CHECK quals are applied in ExecUpdate() and hence we need
> * not do anything special to handle them.
>
> Which I believe is actually getting at the fact that ExecUpdate() will
> run ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK ...) and therefore in
> ExecMergeMatched() we don't need to check WCO_RLS_UPDATE_CHECK, but we
> do still need to check WCO_RLS_MERGE_UPDATE_CHECK (and that's what the
> code does).

Right.

> One thing I wonder about there though is if we really need
> to segregate those..? What's actually making WCO_RLS_MERGE_UPDATE_CHECK
> different from WCO_RLS_UPDATE_CHECK? I get that the DELETE case is
> different, because in a regular DELETE we'll never even see the row, but
> for MERGE, we will see the row (assuming it passes SELECT policies, of
> course) and then will check if it matches and that's when we realize
> that we've been asked to run a DELETE, so we have the special-case of
> WCO_RLS_MERGE_DELETE_CHECK, but that's not true of UPDATE, so I think
> this might be adding a bit of unnecessary complication by introducing
> WCO_RLS_MERGE_UPDATE_CHECK.
>

I've modelled this code on the lines of ON CONFLICT DO NOTHING. And quite
similar to that, I believe we need separate handling for MERGE as well
because we don't (and can't) push down the USING security quals of
UPDATE/DELETE to the scan. So we need to separately check that the target
row actually passes the USING quals. WCO_RLS_MERGE_UPDATE_CHECK and
WCO_RLS_MERGE_DELETE_CHECK
are used for that purpose.

One point to deliberate though is whether it's a good idea to throw an
error when the USING quals fail or should we silently ignore such rows.
Regular UPDATE/DELETE does the latter and ON CONFLICT DO UPDATE does the
former. I chose to throw an error because otherwise it may get confusing to
the user since a row would neither be updated (meaning, it will be seen as
a case of NOT MATCHED), but nor be inserted. I hope no problem there.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-03-16 05:47:47 Re: [bug fix] Cascaded standby cannot start after a clean shutdown
Previous Message Tsunakawa, Takayuki 2018-03-16 05:27:58 RE: [bug fix] Cascaded standby cannot start after a clean shutdown