Re: [HACKERS] MERGE SQL Statement for PG11

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-21 12:23:47
Message-ID: CABOikdPhnYSyn18Rw_g37f5HXsVtn0HpjRCJnOzvOmydrDmF_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Peter,

Thanks for the additional comments.

On Wed, Mar 21, 2018 at 2:17 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

>
>
> * Why is terminating semi-colon on its own line in MERGE sql examples
> from the docs?
>

I checked a few other places and didn't find another example which puts
semi-colon on its own new line. So fixed per your suggestion.

>
> * ON CONFLICT references in MERGE docs should be a "Tip" box, IMV.
>

Changed.

>
> * Docs don't actually say what DO NOTHING does.
>

Added a para about that. In passing, I noticed that even though the grammar
and the docs agree that DO NOTHING is only supported for WHEN NOT MATCHED,
ExecMergeMatched() was unnecessarily checking for DO NOTHING. So fixed the
code.

>
> * The docs say "A condition cannot contain subqueries, set returning
> functions, nor can it contain window or aggregate functions". Thought
> it can now?
>

Yes, it now supports sub-queries. What about set-returning, aggregates etc?
I assume they are not supported in other places such as WHERE conditions
and JOIN quals. So they will continue to remain blocked even in WHEN
conditions. Do you think it's worth mentioning or we should not mention
anything at all?

>
> * The docs say "INSERT actions cannot contain sub-selects". Didn't that
> change?
>

No, it did not. We only support VALUES clause with INSERT action.

>
> * The docs on merge_insert grammar element say "Do not include the
> table name", which seems like it was already covered. It's obvious.
> Suggest removing it.
>

Ok, fixed.

>
> * The docs say "The number of rows inserted, updated and deleted can
> be seen in the output of EXPLAIN ANALYZE". This seems unnecessary. We
> don't see that for ON CONFLICT.
>

You mean the doc changes are unnecessary or the EXPLAIN ANALYZE output is
unnecessary? I assume the doc changes, but let me know if that's wrong
assumption.

>
> * The docs say "it should be noted that no error is raised if that
> occurs". But should it? I don't think that this addresses a perception
> that users are likely to have.
>

Again, do you mean we should raise error or just that the docs should not
mention anything about it? I don't think raising an error because the
candidate row did not meet any specified action is a good idea. May be some
day we add another option to store such rows in a separate temporary table.

>
> * "See MERGE" hyperlink within MERGE docs themselves seem odd.
> Suggest changing this.
>

Removed.

>
> * Trigger behavior doesn't belong in main MERGE doc page, IMV. Just
> like with ON CONFLICT, it should be documented where triggers are
> already documented. The specifics of MERGE can be discussed there.
>
>
Ok. I added couple of paras to trigger.sgml, but left merge.sgml untouched.
Suggestions for better wording are welcome.

> * It might make sense to point out in the docs that join_condition
> should not filter the target table too much. Like SQL server docs say,
> don't put things in the join that filter the target that actually
> belong in the WHEN .. AND quals. In a way, this should be obvious,
> because it's an outer join. But I don't think it is, and ISTM that the
> sensible thing to do is to warn against it.
>
>
Hmm, ok. Not sure how exactly to put that in words without confusing users.
Do you want to suggest something?

> * We never actually get around to saying that MERGE is good with bulk
> loading, ETL, and so on. I think that we should remark on that in
> passing.
>

Suggestion?

>
> * I think that the mvcc.sgml changes can go. Perhaps a passing
> reference to MERGE can be left behind, that makes it clear that it's
> really rather like UPDATE FROM and so on. The fact that it's like
> UPDATE FROM now seems crystal clear.
>
>
It seems useful to me. Should we move it to merge.sgml instead?

> * insert.sgml need not mention MERGE IMV.
>
>
Hmm. I am not sure. It seems worth leaving a reference there since MERGE
provides a new way to handle INSERTs.

>
>
> * Most of the field comments here should be improved:
>
> > /* ----------------
> > * MergeActionState information
> > * ----------------
> > */
> > typedef struct MergeActionState
> > {
> > NodeTag type;
> > bool matched; /* MATCHED or NOT MATCHED */
> > ExprState *whenqual; /* WHEN quals */
> > CmdType commandType; /* type of action */
> > TupleTableSlot *slot; /* instead of ResultRelInfo */
> > ProjectionInfo *proj; /* instead of ResultRelInfo */
> > } MergeActionState;
>
>
Done.

> * I wouldn't talk about a precedent like this:
>
> > > /*
> > > * Process BEFORE EACH STATEMENT triggers
> > > + *
> > > + * The precedent set by ON CONFLICT is that we fire INSERT then
> UPDATE.
> > > + * MERGE follows the same logic, firing INSERT, then UPDATE, then
> DELETE.
> > > */
>
> The comments should read as one voice. Ideally, it will look like ON
> CONFLICT could have just as easily been added after MERGE, unless
> there is a very compelling reason to mention a precedent. I mean, is
> there really any reason to think that the precedent set by ON CONFLICT
> actually made any difference? Is the suggestion here that there is a
> better behavior, that we cannot go with for historical reasons?
>
>
I agree. I removed those comments.

> * This ExecModifyTable() code seems a bit odd:
>
> >
> > if (operation == CMD_MERGE)
> > {
> > ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
> > continue;
> > }
> >
> > tupleid = NULL;
> > oldtuple = NULL;
> > if (junkfilter != NULL)
> > {
> > /*
> > * extract the 'ctid' or 'wholerow' junk attribute.
> > */
> > if (operation == CMD_UPDATE ||
> > operation == CMD_DELETE ||
> > operation == CMD_MERGE)
> > {
>
> Why is there a test for CMD_MERGE if control flow cannot even reach
> there? What's going on here?
>

Just leftover bits from the previous code. Removed.

>
> * No need to say "not planned" here, I think:
>
> > +typedef struct MergeAction
> > +{
> > + NodeTag type;
> > + bool matched; /* MATCHED or NOT MATCHED */
> > + Node *condition; /* conditional expr (raw parser) */
> > + Node *qual; /* conditional expr
> (transformWhereClause) */
> > + CmdType commandType; /* type of action */
> > + Node *stmt; /* T_UpdateStmt etc - not planned */
> > + List *targetList; /* the target list (of ResTarget) */
> > +} MergeAction;
>
>
Fixed and also improved other comments for the struct.

> * Do tables with rules reject MERGE commands sensibly? We should have
> a test for that.
>

That check was indeed missing. Added the check and the test.

>
> * Logical decoding tests (test_decoding regression tests) seem like a
> good idea. This is much less important than with ON CONFLICT, since
> you don't have the significant complication of speculative insertions,
> but it seems like something to add on general principle.
>

Good point, added a test.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0002_merge_v23c_main.patch application/octet-stream 318.5 KB
0001_merge_v23c_onconflict_work.patch application/octet-stream 18.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2018-03-21 12:24:22 Re: pg_basebackup: Missing newlines in some error messages
Previous Message Michael Banck 2018-03-21 12:12:45 pg_basebackup: Missing newlines in some error messages