Re: [HACKERS] MERGE SQL Statement for PG11

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-20 20:47:40
Message-ID: CAH2-Wzmm0KLR-AHvp02FjZXrK2TMD5f8BHvnkyOZoYrmv+jJYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 8, 2018 at 6:59 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Phew! Thanks for your patience and perseverance. I do have more
> feedback on the docs lined up, but that isn't so important right now.

I want to get this feedback on the docs to you now. This is a little
raw, as it's taken from notes made 2 weeks ago. I haven't checked if
these still points haven't been addressed already, so apologies if
you've already done something about some number of them.

* Why is terminating semi-colon on its own line in MERGE sql examples
from the docs?

* ON CONFLICT references in MERGE docs should be a "Tip" box, IMV.

* Docs don't actually say what DO NOTHING does.

* The docs say "A condition cannot contain subqueries, set returning
functions, nor can it contain window or aggregate functions". Thought
it can now?

* The docs say "INSERT actions cannot contain sub-selects". Didn't that change?

* The docs on merge_insert grammar element say "Do not include the
table name", which seems like it was already covered. It's obvious.
Suggest removing it.

* The docs say "The number of rows inserted, updated and deleted can
be seen in the output of EXPLAIN ANALYZE". This seems unnecessary. We
don't see that for ON CONFLICT.

* The docs say "it should be noted that no error is raised if that
occurs". But should it? I don't think that this addresses a perception
that users are likely to have.

* "See MERGE" hyperlink within MERGE docs themselves seem odd.
Suggest changing this.

* Trigger behavior doesn't belong in main MERGE doc page, IMV. Just
like with ON CONFLICT, it should be documented where triggers are
already documented. The specifics of MERGE can be discussed there.

* It might make sense to point out in the docs that join_condition
should not filter the target table too much. Like SQL server docs say,
don't put things in the join that filter the target that actually
belong in the WHEN .. AND quals. In a way, this should be obvious,
because it's an outer join. But I don't think it is, and ISTM that the
sensible thing to do is to warn against it.

* We never actually get around to saying that MERGE is good with bulk
loading, ETL, and so on. I think that we should remark on that in
passing.

* I think that the mvcc.sgml changes can go. Perhaps a passing
reference to MERGE can be left behind, that makes it clear that it's
really rather like UPDATE FROM and so on. The fact that it's like
UPDATE FROM now seems crystal clear.

* insert.sgml need not mention MERGE IMV.

I also have some additional minor feedback on the code itself that
I'll send now, which is a bit more recent (based on the version posted
on 2018-03-08):

* Most of the field comments here should be improved:

> /* ----------------
> * MergeActionState information
> * ----------------
> */
> typedef struct MergeActionState
> {
> NodeTag type;
> bool matched; /* MATCHED or NOT MATCHED */
> ExprState *whenqual; /* WHEN quals */
> CmdType commandType; /* type of action */
> TupleTableSlot *slot; /* instead of ResultRelInfo */
> ProjectionInfo *proj; /* instead of ResultRelInfo */
> } MergeActionState;

* I wouldn't talk about a precedent like this:

> > /*
> > * Process BEFORE EACH STATEMENT triggers
> > + *
> > + * The precedent set by ON CONFLICT is that we fire INSERT then UPDATE.
> > + * MERGE follows the same logic, firing INSERT, then UPDATE, then DELETE.
> > */

The comments should read as one voice. Ideally, it will look like ON
CONFLICT could have just as easily been added after MERGE, unless
there is a very compelling reason to mention a precedent. I mean, is
there really any reason to think that the precedent set by ON CONFLICT
actually made any difference? Is the suggestion here that there is a
better behavior, that we cannot go with for historical reasons?

* This ExecModifyTable() code seems a bit odd:

>
> if (operation == CMD_MERGE)
> {
> ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
> continue;
> }
>
> tupleid = NULL;
> oldtuple = NULL;
> if (junkfilter != NULL)
> {
> /*
> * extract the 'ctid' or 'wholerow' junk attribute.
> */
> if (operation == CMD_UPDATE ||
> operation == CMD_DELETE ||
> operation == CMD_MERGE)
> {

Why is there a test for CMD_MERGE if control flow cannot even reach
there? What's going on here?

* No need to say "not planned" here, I think:

> +typedef struct MergeAction
> +{
> + NodeTag type;
> + bool matched; /* MATCHED or NOT MATCHED */
> + Node *condition; /* conditional expr (raw parser) */
> + Node *qual; /* conditional expr (transformWhereClause) */
> + CmdType commandType; /* type of action */
> + Node *stmt; /* T_UpdateStmt etc - not planned */
> + List *targetList; /* the target list (of ResTarget) */
> +} MergeAction;

* Do tables with rules reject MERGE commands sensibly? We should have
a test for that.

* Logical decoding tests (test_decoding regression tests) seem like a
good idea. This is much less important than with ON CONFLICT, since
you don't have the significant complication of speculative insertions,
but it seems like something to add on general principle.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2018-03-20 21:04:22 Re: [HACKERS] taking stdbool.h into use
Previous Message Thomas Munro 2018-03-20 20:44:45 Add "docs" to top-level Makefile for non-GNU make?