Re: MERGE SQL Statement for PG11

From: Nico Williams <nico(at)cryptonector(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MERGE SQL Statement for PG11
Date: 2017-11-08 00:45:52
Message-ID: 20171108004546.GB4496@localhost
Lists: pgsql-hackers

On Tue, Nov 07, 2017 at 03:31:22PM -0800, Peter Geoghegan wrote:
> On Tue, Nov 7, 2017 at 3:29 PM, Nico Williams <nico(at)cryptonector(dot)com> wrote:
> > On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> >> Nico Williams <nico(at)cryptonector(dot)com> wrote:
> >> >A MERGE mapped to a DML like this:
> >
> > I needed to spend more time reading MERGE docs from other RDBMSes.
>
> Please don't hijack this thread. It's about the basic question of
> semantics, and is already hard enough for others to follow as-is.

I'm absolutely not. If you'd like a pithy summary devoid of detail, it
is this:

I'm making the argument that using ON CONFLICT to implement MERGE
cannot produce a complete implementation [you seem to agree], but
there is at least one light-weight way to implement MERGE with
_existing_ machinery in PG: CTEs.
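
To make the CTE idea concrete, here is a minimal sketch of how a two-armed
MERGE (update when matched, insert when not matched) might be written with
data-modifying CTEs. The tables target and source and their columns are
hypothetical, chosen for illustration only; this is not the mapping from
the earlier message in this thread, and it is not safe against concurrent
inserts into target.

```sql
-- Hypothetical schema, for illustration only:
--   target(id int PRIMARY KEY, val text)
--   source(id int PRIMARY KEY, val text)
WITH updated AS (
    -- Plays the role of WHEN MATCHED THEN UPDATE.
    UPDATE target t
       SET val = s.val
      FROM source s
     WHERE t.id = s.id
    RETURNING t.id
),
inserted AS (
    -- Plays the role of WHEN NOT MATCHED THEN INSERT.
    -- The NOT EXISTS test sees target as of the statement's snapshot,
    -- not the effects of the UPDATE above.
    INSERT INTO target (id, val)
    SELECT s.id, s.val
      FROM source s
     WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id)
    RETURNING id
)
-- Report how many rows each arm touched.
SELECT (SELECT count(*) FROM updated)  AS rows_updated,
       (SELECT count(*) FROM inserted) AS rows_inserted;
```

Both sub-statements run against the same snapshot, and PG does not
promise the order in which data-modifying CTEs apply their changes. The
sketch relies on the two arms touching disjoint sets of rows, so that
order does not matter here.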

It's perfectly fine to implement an executor for MERGE, but I think
that's a bit silly and I explain why.

Further, I explored your question regarding order of events, which you
(and I) think is a very important semantics question. You thought order
of execution / trigger firing should be defined, whereas I think it
should not be, because MERGE explicitly leaves it undefined (at least
MSFT's does).

MSFT's MERGE says:

| For every insert, update, or delete action specified in the MERGE
| statement, SQL Server fires any corresponding AFTER triggers defined
| on the target table, but does not guarantee on which action to fire
| triggers first or last. Triggers defined for the same action honor the
| order you specify.

By implication (though not stated explicitly), the actual updates,
inserts, and deletes can happen in any order, just as the triggers can
fire in any order.

As usual in the world of programming language design, leaving order of
execution undefined as much as possible increases the opportunities to
parallelize. Presumably MSFT is leaving the door open to parallelizing
MERGE, if they haven't already.

By implication, CTEs that have no dependencies on each other are also
ripe for parallelization. This is important too, because one of my goals
is to improve CTE performance. If implementing MERGE as a mapping to CTEs
leads to improvements in CTEs, so much the better. But also this *is* a
simple implementation of MERGE, and simplicity seems like a good thing.

Nico