Re: MERGE SQL Statement

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MERGE SQL Statement
Date: 2008-04-17 06:31:32
Message-ID: 1208413892.4259.244.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2008-04-16 at 22:18 -0400, Robert Treat wrote:

> > * USING query can be a VALUES clause if we wish to do single/few row
> > operations, so MERGE can be used for bulk-loading and OLTP
> >
> > * There is no RETURNING option for MERGE, nor for any INSERT/UPDATE
> > sub-clauses
>
> Is there a reason for this? A returning for any insert/updated rows would be
> great, especially if your doing single row merges via the values clause.

It's non-standard, is the first. Second would be complexity and third
would be difficulty in doing this usefully since you'd probably also
want to know which WHEN clause fired and whether the result for that row
was an I, U or D. Perhaps a later extension, but certainly not first
shot.

> > * WHERE CURRENT OF cursor is not supported anywhere
> > * The join can't be recursive, so no WITH support (common expressions,
> > i.e. non-recursive WITH are supported by SQLServer 2008)
> > * conditions would not allow sub-selects
> >
>
> same question on both of these, is there some technical reason for this? I'd
> imagine people would like both of these options.
>
> > * MERGE would work on base tables only, just like COPY
> > * Changes are made only to that single table
> > * Cannot update a column mentioned in the ON clause cos that would make
> > my head hurt too much
> >
>
> Hmm... if the matching test is always done first, istm you could then update
> the columns that matched on, since you don't really care about the value of
> the on column once you have the row.

I think you can update all columns without difficulty.

> > * MERGE will perform a left outer join between source on left and target
> > on right. There must be no more than 1 row from table-ref for each row
> > in the table. Each row in the table can only be updated once during each
> > MERGE statement. Each non-matching row in the table-ref will result in
> > one INSERT into table.
> >
> > * WHEN clauses are tested in the order specified. If the AND condition
> > returns false then we skip onto the next WHEN clause. We stop once a
> > WHEN clause activates, so only *one* action is ever activated for each
> > row.
> >
> > * AND clauses need not form a complete set, i.e. it is possible that no
> > action will result. It is also possible that some WHEN clauses will
> > never activate because of the execution order; we would not try to
> > prevent this, just document it as a possible user error.
> >
>
> Just curious if any of these behaviors come from the spec? or maybe from
> other databases? they don't seem unreasonable in general though.

First two come from spec following clarifications from other
implementations. The last point about the AND clauses not necessarily
covering 100% of cases follows from the other points.

> > * MERGE will respect Triggers, but not Rules since the rules behaviour
> > is roughly orthogonal to the WHEN clauses
>
> Should there be a new rule option? ie. ON MERGE rules ?

Maybe, but not as part of this project.

> > * MERGE fires UPDATE and INSERT triggers according to which WHEN clause
> > is activated (if any)
> >
> > * It's unclear whether MERGE should activate statement-level triggers,
> > or not. Few of the above sources are explicit either way on this point.
> > DB2 always runs both UPDATE and INSERT statement-level triggers, whether
> > or not rows have been changed; I would suggest we do that also for
> > before triggers. For after statement triggers I would suggest that we
> > track how many updates and inserts are caused and if updates > 0 then we
> > activate the after statement for update triggers, and if inserts > 0
> > then we activate the after statement for insert triggers. If a statement
> > level trigger is activated by both update and insert then it would be
> > possible for both TRIGGER_FIRED_BY_UPDATE() and
> > TRIGGER_FIRED_BY_DELETE() to be true (for statement level triggers
> > only), which would be a change from what we do now, even if the old
> > behaviour was not explicitly mutually exclusive. In both cases I suggest
> > we run Update triggers before Insert triggers consistently for both
> > before and after statement triggers.
> >
>
> It would seem wierd that a before update statement level trigger would fire
> and not the matching after update statement level trigger. I can probably
> live with this behavior though, but seems like a note worth documenting.

Agreed. I think this aspect can be re-visited when we see all of the
test cases.

> > * The number of rows changed should be (inserts + updates) which should
> > be < number of rows returned by table-ref. It would be good to get
> > access to the number of rows inserted and updated, so I propose that we
> > return a NOTICE statement with this information.
> >
>
> How do you handle deletes? IE. If I merge two tables and I end up with no
> updates or inserts but 100 deletes, is the number of affected rows 0 ?

No, 100. I meant: add those as well.

> Nice work, hope my comments will be helpful.

Thanks; yes.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Martijn van Oosterhout 2008-04-17 08:31:56 Re: Lessons from commit fest
Previous Message Heikki Linnakangas 2008-04-17 06:20:50 Re: Patch for Prevent pg_dump/pg_restore from being affected by statement_timeout