Re: [HACKERS] MERGE SQL Statement for PG11

From: Andres Freund <andres(at)anarazel(dot)de>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-04-05 20:00:03
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote:
> > +/*---------------------------------------------------------
> > ----------------
> > + *
> > + * nodeMerge.c
> > + * routines to handle Merge nodes relating to the MERGE command
> >
> > Isn't this file misnamed and it should be execMerge.c? The rest of the
> > node*.c files are for nodes that are invoked via execProcnode /
> > ExecProcNode(). This isn't an actual executor node.
> >
> Makes sense. Done. (Now that the patch is committed, I don't know if there
> would be a rethink about changing file names. May be not, just raising that
> concern)

It absolutely definitely needed to be renamed. But that's been done, so

> > What's the reasoning behind making this be an anomaluous type of
> > executor node?

> Didn't quite get that. I think naming of the file was bad (fixed now), but
> I think it's a good idea to move the new code in a new file, from
> maintainability as well as coverage perspective. If you've something very
> different in mind, can you explain in more details?

Well, it was kinda modeled as an executor node, including the file
name. That's somewhat fixed now. I still am extremely suspicious of
the codeflow here.

My impression is that this simply shouldn't be going through
nodeModifyTuple, but be it's own nodeMerge node. The trigger handling
would need to be abstraced into execTrigger.c or such to avoid
duplication. We're now going from nodeModifyTable.c:ExecModifyTable()
into execMerge.c:ExecMerge(), back to
nodeModifyTable.c:ExecUpdate/Insert(). To avoid ExecInsert() doing
things that aren't appropriate for merge, we then pass an actionState,
which neuters part of ExecUpdate/Insert(). This is just bad.

I think this should be cleaned up before it can be released, which means
this feature should be reverted and cleaned up nicely before being
re-committed. Otherwise we'll sit on this bad architecture and it'll
make future features harder. And if somebody cleans it up Simon will
complain that things are needlessly being destabilized (hello xlog.c
with it's 1000+ LOC functions).

> > + /*
> > + * If there are not WHEN MATCHED actions, we are done.
> > + */
> > + if (mergeMatchedActionStates == NIL)
> > + return true;
> >
> > Maybe I'm confused, but why is mergeMatchedActionStates attached to
> > per-partition info? How can this differ in number between partitions,
> > requiring us to re-check it below fetching the partition info above?
> >
> >
> Because each partition may have a columns in different order, dropped
> attributes etc. So we need to give treatment to the quals/targetlists. See
> ON CONFLICT DO UPDATE for similar code.

But the count wouldn't change, no? So we return before building the
partition info if there's no MATCHED action?

> > It seems fairly bad architecturally to me that we now have
> > EvalPlanQual() loops in both this routine *and*
> > ExecUpdate()/ExecDelete().
> >
> >
> This was done after review by Peter and I think I like the new way too.
> Also keeps the regular UPDATE/DELETE code paths least changed and let Merge
> handle concurrency issues specific to it.

It also makes the whole code barely readable. Minimal amount of changes
isn't a bad goal, but if the consequence of that is poor layering and
repeated code it's bad as well.

> > + /*
> > + * Initialize hufdp. Since the caller is only interested in the failure
> > + * status, initialize with the state that is used to indicate
> > successful
> > + * operation.
> > + */
> > + if (hufdp)
> > + hufdp->result = HeapTupleMayBeUpdated;
> > +
> >
> > This signals for me that the addition of the result field to HUFD wasn't
> > architecturally the right thing. HUFD is supposed to be supposed to be
> > returned by heap_update(), reusing and setting it from other places
> > seems like a layering violation to me.

> I am not sure I agree. Sure we can keep adding more parameters to
> ExecUpdate/ExecDelete and such routines, but I thought passing back all
> information via a single struct makes more sense.

You can just wrap HUFD in another struct that has the necessary

> > +
> > + /*
> > + * Add a whole-row-Var entry to support references to "source.*".
> > + */
> > + var = makeWholeRowVar(rt_rte, rt_rtindex, 0, false);
> > + te = makeTargetEntry((Expr *) var, list_length(*mergeSourceTargetList)
> > + 1,
> > + NULL, true);
> >
> > Why can't this properly dealt with by transformWholeRowRef() etc?
> >
> >
> I just followed ON CONFLICT style. May be there is a better way, but not
> clear how.

What code are you referring to?

> > Why is this, and not building a proper executor node for merge that
> > knows how to get the tuples, the right approach? We did a rough
> > equivalent for matview updates, and I think it turned out to be a pretty
> > bad plan.
> >
> >
> I am still not sure why that would be any better. Can you explain in detail
> what exactly you've in mind and how's that significantly better than what
> we have today?

Because the code flow would be understandable, we'd have proper parser
locations, we'd not have to introduce a new query source type, the
deparsing would be less painful, we'd not have to optimizer like hijinks
in parse analysis etc. In my opinion you're attempting to do planner /
optimizer stuff at the parse-analysis level, and that's just not a good


Andres Freund

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2018-04-05 20:02:09 Re: Vacuum: allow usage of more than 1GB of work mem
Previous Message Jesper Pedersen 2018-04-05 19:47:12 Re: [HACKERS] Runtime Partition Pruning