Re: MERGE ... RETURNING

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Vik Fearing <vik(at)postgresfriends(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Gurjeet Singh <gurjeet(at)singh(dot)im>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: MERGE ... RETURNING
Date: 2023-11-01 17:19:37
Message-ID: CAHyXU0xeTggH4XwLvR2HbpoOmK_-mKofsAWoEonGnFfSbPg0JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 1, 2023 at 5:12 AM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
wrote:

> On Tue, 31 Oct 2023 at 23:19, Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
> >
> > On 10/31/23 19:28, Jeff Davis wrote:
> >
> > > Assuming we have one RETURNING clause at the end, then it creates the
> > > problem of how to communicate which WHEN clause a tuple came from,
> > > whether it's the old or the new version, and/or which action was
> > > performed on that tuple.
> > >
> > > How do we communicate any of those things? We need to get that
> > > information into the result table somehow, so it should probably be
> > > some kind of expression that can exist in the RETURNING clause. But
> > > what kind of expression?
> > >
> > > (a) It could be a totally new expression kind with a new keyword (or
> > > recycling some existing keywords for the same effect, or something that
> > > looks superficially like a function call but isn't) that's only valid
> > > in the RETURNING clause of a MERGE statement. If you use it in another
> > > expression (say the targetlist of a SELECT statement), then you'd get a
> > > failure at parse analysis time.
> >
> > This would be my choice, the same as how the standard GROUPING()
> > "function" for grouping sets is implemented by GroupingFunc.
> >
>
> Something I'm wondering about is to what extent this discussion is
> driven by concerns about aspects of the implementation (specifically,
> references to function OIDs in code), versus a desire for a different
> user-visible syntax. To a large extent, those are orthogonal
> questions.
>
> (As an aside, I would note that there are already around a dozen
> references to specific function OIDs in the parse analysis code, and a
> lot more if you grep more widely across the whole of the backend
> code.)
>
> At one point, as I was writing this patch, I went part-way down the
> route of adding a new node type (I think I called it MergeFunc), for
> these merge support functions, somewhat inspired by GroupingFunc. In
> the end, I backed out of that approach, because it seemed to be
> introducing a lot of unnecessary additional complexity, and I decided
> that a regular FuncExpr would suffice.
>
> If pg_merge_action() and pg_merge_when_clause_number() were
> implemented using a MergeFunc node, it would reduce the number of
> places that refer to specific function OIDs. Basically, a MergeFunc
> node would be very much like a FuncExpr node, except that it would
> have a "levels up" field, set during parse analysis, at the point
> where we check that it is being used in a merge returning clause, and
> this field would be used during subselect planning. Note, however,
> that that doesn't entirely eliminate references to specific function
> OIDs -- the parse analysis code would still do that. Also, additional
> special-case code in the executor would be required to handle
> MergeFunc nodes. Also, code like IncrementVarSublevelsUp() would need
> adjusting, and anything else like that.
>
> A separate question is what the syntax should be. We could invent a
> new syntax, like GROUPING(). Perhaps:
>
> MERGE(ACTION) instead of pg_merge_action()
> MERGE(CLAUSE NUMBER) instead of pg_merge_when_clause_number()
>

Hm, still struggling with this merge action and (especially) number stuff.
Currently we have:

WHEN MATCHED [ AND *condition* ] THEN { *merge_update* |
*merge_delete* | DO NOTHING } |
WHEN NOT MATCHED [ AND *condition* ] THEN { *merge_insert* | DO NOTHING } }

What about extending to something like:

WHEN MATCHED [ AND *condition* ] [ AS *merge_clause_name ]*

WHEN MATCHED AND tid > 2 AS giraffes THEN UPDATE SET balance = t.balance +
delta

...and have pg_merge_clause() return 'giraffes' (of name type). If merge
clause is not identified, maybe don't return any data for that clause
through returning,, or return NULL. Maybe 'returning' clause doesn't have
to be extended or molested in any way, it would follow mechanics as per
'update', and could not refer to identified merge_clauses, but would allow
for pg_merge_clause() functioning. You wouldn't need to identify action or
number. Food for thought, -- may have missed some finer details upthread.

for example,
with r as (
merge into x using y on x.a = y.a
when matched and x.c > 0 as good then do nothing
when matched and x.c <= 0 as bad then do nothing
returning pg_merge_clause(), x.*
) ...

yielding
pg_merge_clause a c
good 1 5
good 2 7
bad 3 0
...

...maybe allow pg_merge_clause() take to optionally yield column name:
returning pg_merge_clause('result'), x.*
) ...

yielding
result a c
good 1 5
good 2 7
bad 3 0
...

merlin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-11-01 17:49:02 Re: MERGE ... RETURNING
Previous Message Ben Snaidero 2023-11-01 16:58:17 max_standby_streaming_delay setting not cancelling query on replica