Re: support for MERGE

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Daniel Westermann <dwe(at)dbi-services(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Subject: Re: support for MERGE
Date: 2022-02-27 18:36:08
Message-ID: CALNJ-vRPRn731av4EFexBdTuFCPnaei-gM-aJE0trEJ6Z8c6XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 27, 2022 at 9:25 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> I attach v12 of MERGE. Significant effort went into splitting
> ExecUpdate and ExecDelete into parts that can be reused from the MERGE
> routines in a way that doesn't involve jumping out in the middle of
> TM_Result processing to merge-specific EvalPlanQual handling. So in
> terms of code flow, this is much cleaner. It does mean that MERGE now
> has to call three routines instead of just one, but that doesn't seem a
> big deal.
>
> So now the main ExecUpdate first has to call ExecUpdatePrologue, then
> ExecUpdateAct, and complete with ExecUpdateEpilogue. In the middle of
> all this, it does its own thing to deal with foreign tables, and result
> processing for regular tables including EvalPlanQual. ExecUpdate has
> been split similarly.
>
> So MERGE now does these three things, and then its own result
> processing.
>
> Now, naively patching in that way results in absurd argument lists for
> these routines, so I introduced a new ModifyTableContext struct to carry
> the context for each operation. I think this is good, but perhaps it
> could stand some more improvement in terms of what goes in there and
> what doesn't. It took me a while to find an arrangement that really
> worked. (Initially I had resultRelInfo and the tuple slot and some
> flags for DELETE, but it turned out to be better to keep them out.)
>
> Regarding code arrangement: I decided that exporting all those
> ModifyTable internal functions was a bad idea, so I made it all static
> again. I think the MERGE routines are merely additional ModifyTable
> methods; trying to make it a separate executor node doesn't seem like
> it'd be an improvement. For now, I just made nodeModifyTable.c #include
> execMerge.c, though I'm not sure if moving all the code into
> nodeModifyTable.c would be a good idea, or whether we should create
> separate files for each of those methods. Generally speaking I'm not
> enthusiastic about creating an exported API of these methods. If we
> think nodeModifyTable.c is too large, maybe we can split it in smaller
> files and smash them together with #include "foo.c". (If we do expose
> it in some .h then the ModifyTableContext would have to be exported as
> well, which doesn't sound too exciting.)
>
>
> Sadly, because I've been spending a lot of time preparing for an
> international move, I didn't have time to produce a split patch that
> would first restructure nodeModifyTable.c making this more reviewable,
> but I'll do that as soon as I'm able. There is also a bit of a bug in a
> corner case of an UPDATE that involves a cross-partition tuple migration
> with another concurrent update. I was unable to figure out how to fix
> that, so I commented out the affected line from merge-update.spec.
> Again, I'll get that fixed as soon as the dust has settled here.
>
> There are some review points that are still unaddressed, such as
> Andres' question of what happens in case of a concurrent INSERT.
>
> Thanks
>
> --
> Álvaro Herrera 39°49'30"S 73°17'W —
> https://www.EnterpriseDB.com/
> "La fuerza no está en los medios físicos
> sino que reside en una voluntad indomable" (Gandhi)
>

Hi,

+ else if (node->operation == CMD_MERGE)
+ {
+ /* EXPLAIN ANALYZE display of actual outcome for each tuple
proposed */

I know the comment was copied. But it seems `proposed` should be
`processed`.

For ExecMergeNotMatched():

+ foreach(l, actionStates)
+ {
...
+ break;
+ }

If there is only one state in the List, maybe add an assertion for the
length of the actionStates List.

+ ModifyTableContext sc;

It seems the variable name can be more intuitive. How about naming it mtc
(or context, as what later code uses) ?

Cheers

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-02-27 20:17:13 Re: support for MERGE
Previous Message Euler Taveira 2022-02-27 18:07:16 Re: Commitfest manager for 2022-03