Re: [HACKERS] MERGE SQL Statement for PG11

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-09 14:55:35
Message-ID: CABOikdNj+8HEJ5D8tu56mrPkjHVRrBb2_cdKWwpiYNcjXgDw8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9 March 2018 at 08:29, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Thu, Mar 8, 2018 at 4:54 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
> wrote:
> > Thanks for the feedback. I've greatly refactored the code based on your
> > comments and I too like the resultant code. What we have now have
> > essentially is: two functions:
>
> I really like how ExecMerge() now mostly just consists of this code
> (plus a lot of explanatory comments):
>

Thanks!

> My #1 concern has become RLS, and
> perhaps only because I haven't studied it in enough detail.

Sure. I've done what I thought is the right thing to do, but please check.
Stephen also wanted to review RLS related code; I don't know if he had
chance to do so.

> It seems
> like getting this patch committable is now a matter of verifying
> details. Of course, there are a lot of details to verify. :-)
>

Let's work through them together and hopefully we shall get there.

>
>
>
> * MERGE code needs to do cardinality violations like ON CONFLICT.
> Specifically, you need the TransactionIdIsCurrentTransactionId()
> check, and a second error fallback "attempted to lock invisible tuple"
> error (though this should say "attempted to update or delete invisible
> tuple" instead). The extra check is redundant, but better safe than
> sorry. I like defensive errors like this because it makes the code
> easier to read -- I don't get distracted by their absence.
>

Ok. Fixed. I also added the missing case for HeapTupleInvisible, though I
don't see how we can reach there. Since ExecUpdate()/ExecDelete() always
wait for any concurrent transaction to finish, I don't see how we can
reach HeapTupleBeingUpdated. So I skipped that test (or may be we should
just add an error for completeness).

>
> * The last item on cardinality violations also implies this new merge
> comment should be changed:
>
> > + /*
> > + * The target tuple was already updated or deleted
> by the
> > + * current command, or by a later command in the
> current
> > + * transaction.
> > + */
>
> It should be changed because it can't have been a later command in
> this xact. We handled that case when we called ExecUpdate or
> ExecDelete() (plus we now have the extra defensive "can't happen"
> elog() error).
>

Removed.

>
> * This related comment shouldn't even talk about update/historic
> behavior, now that it isn't in ExecUpdate() -- just MERGE:
>
> > + /*
> > + * The former case is possible in a join UPDATE where
> > + * multiple tuples join to the same target tuple.
> This is
> > + * pretty questionable, but Postgres has always
> allowed
> > + * it: we just execute the first update action and
> ignore
> > + * additional update attempts. SQLStandard
> disallows this
> > + * for MERGE, so allow the caller to select how to
> handle
> > + * this.
> > + */
>

Removed.

>
> * This wording should be tweaked:
>
> > + * If the current tuple is that last tuple in the
> update
> > + * chain, then we know that the tuple was
> concurrently
> > + * deleted. We just switch to NOT MATCHED case and
> let the
> > + * caller retry the NOT MATCHED actions.
>
> This should say something like "caller can move on to NOT MATCHED
> actions". They can never go back from there, of course, which I want
> us to be clear on.
>

Fixed, but please check if the new wording is OK.

>
> * This check of whether whenqual is set is unnecessary, and doesn't
> match MATCHED code, or the accompanying comments:
>
> > + /*
> > + * Test condition, if any
> > + *
> > + * In the absence of a condition we perform the action
> unconditionally
> > + * (no need to check separately since ExecQual() will return
> true if
> > + * there are no conditions to evaluate).
> > + */
> > + if (action->whenqual && !ExecQual(action->whenqual, econtext))
> > + continue;
>

Fixed.

>
> * I think that this ExecMerge() assertion is not helpful, since you go
> on to dereference the pointer in all cases anyway:
>
> > + Assert(junkfilter != NULL);
>

Removed.

>
> * Executor README changes, particularly about projecting twice, really
> should be ExecMerge() comments. Maybe just get rid of these?
>

Fixed, with addition to TABLEOID column that we now fetch along with CTID
column.

>
> * Why are we using CMD_NOTHING at all? That constant has something to
> do with user-visible rules, and there is no need to reuse it (make a
> new CMD_* if you have to). More importantly, why do we even have the
> corresponding DO NOTHING stuff in the grammar? Why would users want
> that?
>
> For quite a while, I thought that patch must have been support for ON
> CONFLICT DO NOTHING within MERGE INSERTs (the docs don't even say what
> DO NOTHING is). But that's not what it is at all. It seems like this
> is a way of having an action that terminates early, so you don't have
> to go on to evaluate other action quals.

Hmm. I was under the impression that the SQL Standards support DO NOTHING.
But now that I read the standards, I can't find any mention of DO NOTHING.
It might still be useful for users to skip all (matched/not-matched/both)
actions for some specific conditions. For example,

WHEN MATCHED AND balance > min_balance THEN
DO NOTHING
WHEN MATCHED AND custcat = 'priority' THEN
DO NOTHING
....

But I agree that if we remove DO NOTHING, we can simplify the code. Or we
can probably re-arrange the code to check DO NOTHING early and exit the
loop. That simplifies it a lot, as in attached. In passing, I also realised
that the target tuple may also be needed for DO NOTHING actions since they
may have quals attached to them. Also, the tuple should really be fetched
just once, not once per action. Those changes are done.

>
> * The docs say that we use a LEFT OUTER JOIN for MERGE, while the
> implementation uses a RIGHT OUTER JOIN because it's convenient for
> parse analysis finding the target. This difference is a bit confusing.
> Why say that we use any kind of join at all, though?
>

Hmm, right. In fact, we revert to INNER JOIN when there are no NOT MATCHED
actions, so probably we should just not mention any kind of joins,
definitely not in the user documentation.

>
> The SQL standard doesn't say outer join at all. Neither do the MERGE
> docs of the other database systems that I checked. Taking a position
> on this seems to add nothing at all. Let's make the MERGE docs refer
> to it as a join, without further elaboration.
>
> * I don't find this comment from analyze.c very helpful:
>
> > + * We don't have a separate plan for each action, so the when
> > + * condition must be executed as a per-row check, making it very
> > + * similar to a CHECK constraint and so we adopt the same
> semantics
> > + * for that.
>
> Why explain it that way at all? There are two rels, unlike a check
> constraint.
>

Agreed. I think it was left-over from the time when sub-selects were not
allowed and we were using EXPR_KIND_CHECK_CONSTRAINT to run those checks.
Now we have a separate expression kind and we do support sub-selects.

>
> * The first time I read this comment, it made me laugh:
>
> > + /*
> > + * First rule of MERGE club is we don't talk about rules
> > + */
>
> The joke has become significantly less funny since then, though. I'd
> just say that MERGE doesn't support rules, as it's unclear how that
> could work.
>

Changed that way.

>
> * This comment seems redundant, since pstate is always allocated with
> palloc0():
>
> + * Note: we assume that the pstate's p_rtable, p_joinlist, and p_namespace
> + * lists were initialized to NIL when the pstate was created.
>
> make_parsestate() knows about this. This is widespread, normal
> practice during parse analysis.
>

Ok. Removed.

>
> * Is this actually needed at all?:
>
> > + /* In MERGE when and condition, no system column is allowed */
> > + if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
> > + attnum < InvalidAttrNumber &&
> > + !(attnum == TableOidAttributeNumber || attnum ==
> ObjectIdAttributeNumber))
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > + errmsg("system column \"%s\" reference in WHEN AND
> condition is invalid",
> > + colname),
> > + parser_errposition(pstate, location)));
>
> We're talking about the scantuple here. It's not like excluded.*.
>

We might be able to support them, but do we really care?

>
> * Are these checks really needed?:
>
> > +void
> > +rewriteTargetListMerge(Query *parsetree, Relation target_relation)
> > +{
> > + Var *var = NULL;
> > + const char *attrname;
> > + TargetEntry *tle;
> > +
> > + if (target_relation->rd_rel->relkind == RELKIND_RELATION ||
> > + target_relation->rd_rel->relkind == RELKIND_MATVIEW ||
> > + target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>
>
You're right. Those checks are done at the beginning and we should probably
just turn this into an Assert, just like ExecMerge(). Changed that way.

> * I think that you should remove this debug include:
>
> > index 3a02307bd9..ed4e857477 100644
> > --- a/src/backend/parser/parse_clause.c
> > +++ b/src/backend/parser/parse_clause.c
> > @@ -31,6 +31,7 @@
> > #include "commands/defrem.h"
> > #include "nodes/makefuncs.h"
> > #include "nodes/nodeFuncs.h"
> > +#include "nodes/print.h"
> > #include "optimizer/tlist.h"
> > #include "optimizer/var.h"
> > #include "parser/analyze.h"
>

Done.

>
> It may be worth considering custom dprintf code within your .gdbinit
> to do stuff like this automatically, without code changes that may end
> up in the patch you post to the list. Here is one that I sometimes use
> for tuplesort.c:
>
> define print_tuplesort_counts
> dprintf tuplesort_sort_memtuples, "memtupsize: %d, memtupcount:
> %d\n", state->memtupsize, state->memtupcount
> end
>
> This is way easier than adding custom printf style debug code, since
> it doesn't require that you rebuild. It would be safe to add a similar
> dprintf that called pprint() or similar when some function is reached.
>

Ok. Thanks for the hint.

>
> * find_mergetarget_parents() and find_mergetarget_for_rel() could both
> use at least a one-liner header comment.
>

Done. It was indeed confusing, so comments should help.

>
> * This analyze.c comment, which is in transformMergeStmt(), seems
> pretty questionable:
>
> > + /*
> > + * Simplify the MERGE query as much as possible
> > + *
> > + * These seem like things that could go into Optimizer, but they are
> > + * semantic simplications rather than optimizations, per se.
> > + *
> > + * If there are no INSERT actions we won't be using the non-matching
> > + * candidate rows for anything, so no need for an outer join. We do
> still
> > + * need an inner join for UPDATE and DELETE actions.
>
> This is talking about an ad-hoc form of join strength reduction. Yeah,
> it's a semantic simplification, but that's generally true of join
> strength reduction, which mostly exists because of bad ORMs that
> sprinkle OUTER on top of queries indifferently. MERGE is hardly an
> ideal candidate for join strength reduction, and I would just lose
> this code entirely for Postgres v11.
>

I am not sure if there could be additional query optimisation possibilities
when a RIGHT OUTER JOIN is changed to INNER JOIN. Apart from existing
optimisations, I am also thinking about some of the work that David and
Amit are doing for partition pruning and I wonder if the choice of join
might have a non-trivial effect. Having said that, I am yet to explore
those things. But when we definitely know that a different kind of JOIN is
all we need (because there are no NOT MATCHED actions), why not use that?
Or do you see a problem there?

>
> * This sounds really brittle, and so doesn't make sense even as an
> aspiration:
>
> > + * XXX if we were really keen we could look through the actionList
> and
> > + * pull out common conditions, if there were no terminal clauses and
> put
> > + * them into the main query as an early row filter but that seems
> like an
> > + * atypical case and so checking for it would be likely to just be
> wasted
> > + * effort.
> > + */
>

We might actually want to do this, if not for v11 then later. I am not sure
if we should keep it in the final commit though.
.

>
> * You should say *when* this happens (what later point):
>
> > + *
> > + * Track the RTE index of the target table used in the join query.
> This is
> > + * later used to add required junk attributes to the targetlist.
> > + */
>

Done.

>
> * This seems unnecessary, as we don't say anything like it for ON
> CONFLICT DO UPDATE:
>
> > + * As top-level statements INSERT, UPDATE and DELETE have a Query,
> whereas
> > + * with MERGE the individual actions do not require separate
> planning,
> > + * only different handling in the executor. See nodeModifyTable
> handling
> > + * of commandType CMD_MERGE.
>
>
I don't see anything wrong with keeping it. May be it needs rewording?

> * What does this mean?:
>
> > + * A sub-query can include the Target, but otherwise the sub-query
> cannot
> > + * reference the outermost Target table at all.
>

Don't know :-) I think what Simon probably meant to say that you can't
reference the Target table in the sub-query used in the source relation.
But you can add the Target as a separate RTE

>
> * I don't see the point in this:
>
> > + * XXX Perhaps we require Parallel Safety since that is a
> superset of
> > + * the restriction and enforcing that makes it easier to consider
> > + * running MERGE plans in parallel in future.
>
>
Yeah, removed.

> * This references the now-removed executor WAL check thing, so you'll
> need to remove it too:
>
> > + * SQL Standard says we should not allow anything that possibly
> > + * modifies SQL-data. We enforce that with an executor check
> that we
> > + * have not written any WAL.
>
>
Yes, removed too.

> * This should be reworded:
>
> > + * Note that we don't add this to the MERGE Query's quals because
> > + * that's not the logic MERGE uses.
> > + */
> > + action->qual = transformWhereClause(pstate, action->condition,
> > + EXPR_KIND_MERGE_WHEN_AND,
> "WHEN");
>
> Perhaps say "WHEN ... AND quals are evaluated separately from the
> MERGE statement's join quals" instead. Or just lose it altogether.
>
>
Reworded slightly.

> * This comment seems inaccurate:
>
> > + /*
> > + * Process INSERT ... VALUES with a single VALUES
> > + * sublist. We treat this case separately for
> > + * efficiency. The sublist is just computed
> directly
> > + * as the Query's targetlist, with no VALUES
> RTE. So
> > + * it works just like a SELECT without any FROM.
> > + */
>
> Wouldn't it be more accurate to say that this it totally different to
> what transformInsertStmt() does for a SELECT without any FROM? Also,
> do you think that that's good?
>

Actually it's pretty much same as what transformInsertStmt(). Even the
comments are verbatim copied from there. I don't see anything wrong with
the comment or the code. Can you please explain what problem you see?

>
> >> * Tests for transition table behavior (mixed INSERTs, UPDATEs, and
> >> DELETEs) within triggers.sql seems like a good idea.
> >
> > Ok, I will add. But not done in this version.
>
> Note that it's implied in at least one place that we don't support
> transition tables at all:
>
> > + /*
> > + * XXX if we support transition tables this would need
> to move
> > + * earlier before ExecSetupTransitionCaptureState()
> > + */
> > + switch (action->commandType)
> > + {
>
> You'll want to get to this as part of that transition table effort.
>

I actually didn't even know what transition tables are until today. But
today I studied them and the new version now supports transition tables
with MERGE. We might consider it to WIP given the amount of time I've spent
coding that, though I am fairly happy with the result so far. The comment
above turned out to be bogus.

I decided to create new tuplestores and mark them as new_update,
old_update, new_insert and old_delete. This is necessary because MERGE can
run all three kinds of commands i.e. UPDATE, DELETE and INSERT, and we
would like to track their transition tables separately.

(Hmm.. I just noticed though INSERT ON CONFLICT does not do this and still
able to track transition tables for INSERT and UPDATE correctly. So may be
what I did wasn't necessary after all, though it's also likely that we can
get IOC to use this new mechanism)

>
> Any plan to fix this/support identity columns? I see that you don't
> support them here:
>
> > + /*
> > + * Handle INSERT much like in transformInsertStmt
> > + *
> > + * XXX currently ignore stmt->override, if present
> > + */
>
>
I have already added support for OVERRIDING. The comment needed adjustments
which I have done now.

> I think that this is a blocker, unfortunately.
>

You mean OVERRIDING or ruleutils?

>
> >> * I still think we probably need to add something to ruleutils.c, so
> >> that MERGE Query structs can be deparsed -- see get_query_def().
> >
> > Ok. I will look at it. Not done in this version though.
>
> I also wonder what it would take to support referencing CTEs. Other
> implementations do have this. Why can't we?
>

Hmm. I will look at them. But TBH I would like to postpone them to v12 if
they turn out to be tricky. We have already covered a very large ground in
the last few weeks, but we're approaching the feature freeze date.

>
> Phew! Thanks for your patience and perseverance. I do have more
> feedback on the docs lined up, but that isn't so important right now.
>

Thanks! Those were really useful review comments. In passing, I made some
updates to the doc, but I really should make a complete pass over the patch.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
merge_v20.patch application/octet-stream 300.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-03-09 15:17:57 Re: [HACKERS] MERGE SQL Statement for PG11
Previous Message Sergei Kornilov 2018-03-09 14:51:25 Re: using index or check in ALTER TABLE SET NOT NULL