Re: MERGE ... RETURNING

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Gurjeet Singh <gurjeet(at)singh(dot)im>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: MERGE ... RETURNING
Date: 2023-10-27 14:46:37
Message-ID: CAEZATCX5bQHczt_aSiB+1qQaMnSGAhHBceuqWPwiDc64BGXokw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 25 Oct 2023 at 02:07, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
> On Tue, Oct 24, 2023 at 2:11 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>>
>> Can we revisit the idea of a per-WHEN RETURNING clause? The returning
>> clauses could be treated kind of like a UNION, which makes sense
>> because it really is a union of different results (the returned tuples
>> from an INSERT are different than the returned tuples from a DELETE).
>> You can just add constants to the target lists to distinguish which
>> WHEN clause they came from.
>>
> Yeah. Side benefit, the 'action_number' felt really out of place, and that neatly might solve it. It doesn't match tg_op, for example. With the current approach, return a text, or an enum? Why doesn't it match concepts that are pretty well established elsewhere? SQL has a pretty good track record for not inventing weird numbers with no real meaning (sadly, not so much the developers). Having said that, pg_merge_action() doesn't feel too bad if the syntax issues can be worked out.
>

I've been playing around a little with per-action RETURNING lists, and
attached is a working proof-of-concept (no docs yet).

The implementation is simplified a little by not needing special merge
support functions, but overall this approach introduces a little more
complexity, which is perhaps not surprising.

One fiddly part is resolving the shift/reduce conflicts in the
grammar. Specifically, on seeing "RETURNING expr when ...", there is
ambiguity over whether the "when" is a column alias or the start of
the next merge action. I've resolved that by assigning a slightly
higher precedence to an expression without an alias, so WHEN is
assumed to not be an alias. It seems pretty ugly though (in terms of
having to duplicate so much code), and I'd be interested to know if
there's a neater way to do it.

From a usability perspective, I'm still somewhat sceptical about this
approach. It's a much more verbose syntax, and it gets quite tedious
having to repeat the RETURNING list for every action, and keep them in
sync. I also note that other database vendors seem to have opted for
the single RETURNING list approach (not that we necessarily need to
copy them).

The patch enforces the rule that if any action has a RETURNING list,
they all must have a RETURNING list. Not doing that leads to the
number of rows returned not matching the command tag, or the number of
rows modified, which I think would just lead to confusion. Also, it
would likely be a source of easy-to-overlook mistakes. One such
mistake would be assuming that a RETURNING list at the very end
applied to all actions, though it would also be easy to accidentally
omit a RETURNING list in the middle of the command.

Having said that, I wonder if it would make sense to also support
having a RETURNING list at the very end, if there are no other
RETURNING lists. If we see that, we could automatically apply it to
all actions, which I think would be much more convenient in situations
where you don't care about the action executed, and just want the
results from the table. That would go a long way towards addressing my
usability concerns.

Regards,
Dean

Attachment Content-Type Size
POC-support-merge-returning-per-action.patch text/x-patch 77.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Wagner 2023-10-27 15:00:51 Re: Enderbury Island disappeared from timezone database
Previous Message Tom Lane 2023-10-27 14:27:44 Re: btree_gin: Incorrect leftmost interval value