Re: MERGE ... RETURNING

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: MERGE ... RETURNING
Date: 2023-07-06 05:12:46
Message-ID: CABwTF4U_jm-hVPKHc0Rxpv+ayNfWWhQyObkEjpY6EAhi72M7Ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 1, 2023 at 4:08 AM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Mon, 13 Mar 2023 at 13:36, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >
> > And another rebase.
> >
>
> I ran out of cycles to pursue the MERGE patches in v16, but hopefully
> I can make more progress in v17.
>
> Looking at this one with fresh eyes, it looks mostly in good shape.

+1

Most of the review was done with the v6 of the patch, minus the doc
build step. The additional changes in v7 of the patch were eyeballed,
and tested with `make check`.

> To
> recap, this adds support for the RETURNING clause in MERGE, together
> with new support functions pg_merge_action() and
> pg_merge_when_clause() that can be used in the RETURNING list of MERGE

nit: s/can be used in/can be used only in/

> to retrieve the kind of action (INSERT/UPDATE/DELETE), and the index
> of the WHEN clause executed for each row merged. In addition,
> RETURNING support allows MERGE to be used as the source query in COPY
> TO and WITH queries.
>
> One minor annoyance is that, due to the way that pg_merge_action() and
> pg_merge_when_clause() require access to the MergeActionState, they
> only work if they appear directly in the RETURNING list. They can't,
> for example, appear in a subquery in the RETURNING list, and I don't
> see an easy way round that limitation.

I believe that's a serious limitation, and would be a blocker for the feature.

> Attached is an updated patch with some cosmetic updates, plus updated
> ruleutils support.

With each iteration of your patch, it is becoming increasingly clear
that this will be a documentation-heavy patch :-)

I think the name of function pg_merge_when_clause() can be improved.
How about pg_merge_when_clause_ordinal().

In the doc page of MERGE, you've moved the 'with_query' from the
bottom of Parameters section to the top of that section. Any reason
for this? Since the Parameters section is quite large, for a moment I
thought the patch was _adding_ the description of 'with_query'.

+ /*
+ * Merge support functions should only be called directly from a MERGE
+ * command, and need access to the parent ModifyTableState. The parser
+ * should have checked that such functions only appear in the RETURNING
+ * list of a MERGE, so this should never fail.
+ */
+ if (IsMergeSupportFunction(funcid))
+ {
+ if (!state->parent ||
+ !IsA(state->parent, ModifyTableState) ||
+ ((ModifyTableState *) state->parent)->operation != CMD_MERGE)
+ elog(ERROR, "merge support function called in non-merge context");

As the comment says, this is an unexpected condition, and should have
been caught and reported by the parser. So I think it'd be better to
use an Assert() here. Moreover, there's ERROR being raised by the
pg_merge_action() and pg_merge_when_clause() functions, when they
detect the function context is not appropriate.

I found it very innovative to allow these functions to be called only
in a certain part of certain SQL command. I don't think there's a
precedence for this in Postgres; I'd be glad to learn if there are
other functions that Postgres exposes this way.

- EXPR_KIND_RETURNING, /* RETURNING */
+ EXPR_KIND_RETURNING, /* RETURNING in INSERT/UPDATE/DELETE */
+ EXPR_KIND_MERGE_RETURNING, /* RETURNING in MERGE */

Having to invent a whole new ParseExprKind enum, feels like a bit of
an overkill. My only objection is that this is exactly like
EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt
with exactly in as many places. But I don't have any alternative
proposals.

--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
+/* Is this a merge support function? (Requires fmgroids.h) */
+#define IsMergeSupportFunction(funcid) \
+ ((funcid) == F_PG_MERGE_ACTION || \
+ (funcid) == F_PG_MERGE_WHEN_CLAUSE)

If it doesn't cause recursion or other complications, I think we
should simply include the fmgroids.h in pg_proc.h. I understand that
not all .c files/consumers that include pg_proc.h may need fmgroids.h,
but #include'ing it will eliminate the need to keep the "Requires..."
note above, and avoid confusion, too.

--- a/src/test/regress/expected/merge.out
+++ b/src/test/regress/expected/merge.out

-WHEN MATCHED AND tid > 2 THEN
+WHEN MATCHED AND tid >= 2 THEN

This change can be treated as a bug fix :-)

+-- COPY (MERGE ... RETURNING) TO ...
+BEGIN;
+COPY (
+ MERGE INTO sq_target t
+ USING v
+ ON tid = sid
+ WHEN MATCHED AND tid > 2 THEN

For consistency, the check should be tid >= 2, like you've fixed in
command referenced above.

+BEGIN;
+COPY (
+ MERGE INTO sq_target t
+ USING v
+ ON tid = sid
+ WHEN MATCHED AND tid > 2 THEN
+ UPDATE SET balance = t.balance + delta
+ WHEN NOT MATCHED THEN
+ INSERT (balance, tid) VALUES (balance + delta, sid)
+ WHEN MATCHED AND tid < 2 THEN
+ DELETE
+ RETURNING pg_merge_action(), t.*
+) TO stdout;
+DELETE 1 100
+ROLLBACK;

I expected the .out file to have captured the stdout. I'm gradually,
and gladly, re-learning bits of the test infrastructure.

The DELETE command tag in the output does not feel appropriate for a
COPY command that's using MERGE as the source of the data.

+CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
+ OUT action text, OUT tid int,
OUT new_balance int)
+LANGUAGE sql AS
+$$
+ MERGE INTO sq_target t
+ USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
+ ON tid = v.sid
+ WHEN MATCHED AND tid > 2 THEN

Again, for consistency, the comparison operator should be >=. There
are a few more occurrences of this comparison in the rest of the file,
that need the same treatment.

Best regards,
Gurjeet
http://Gurje.et

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-07-06 05:28:57 Re: Add index scan progress to pg_stat_progress_vacuum
Previous Message Amit Kapila 2023-07-06 03:56:37 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication