Quick Links

Re: MERGE SQL Statement for PG11

From:	Nico Williams <nico(at)cryptonector(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MERGE SQL Statement for PG11
Date:	2017-11-02 20:32:24
Message-ID:	20171102203223.GT4496@localhost
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Nov 02, 2017 at 12:51:45PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico(at)cryptonector(dot)com> wrote:
> >If you want to ignore conflicts arising from concurrency you could
> >always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I
> >proposed earlier. Thus a MERGE CONCURRENTLY could just do that.
> >
> >Is there any reason not to map MERGE as I proposed?
>
> Performance, for one. MERGE generally has a join that can be optimized
> like an UPDATE FROM join.

Ah, right, I think my mapping was pessimal. How about this mapping
instead then:

WITH
updated AS (
UPDATE <target>
SET ...
WHERE <condition>
RETURNING <target>
)
, inserted AS (
INSERT INTO <target>
SELECT ...
WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
/*
* Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
* of concurrency.
*/
RETURNING <target>
)
DELETE FROM <target>
WHERE <key> NOT IN (SELECT <key> FROM updated) AND
<key> NOT IN (SELECT <key> FROM inserted) AND ...;

If a MERGE has no delete clause, then the mapping would be:

WITH
updated AS (
UPDATE <target>
SET ...
WHERE <condition>
RETURNING <target>
)
INSERT INTO <target>
SELECT ...
WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
/*
* Add ON CONFLICT DO NOTHING here to avoid conflicts in the face
* of concurrency.
*/
;

> I haven't studied this question in any detail, but FWIW I think that
> using CTEs for merging is morally equivalent to a traditional MERGE
> implementation. [...]

I agree. So why not do that initially? Optimize later.

Such a MERGE mapping could be implemented entirely within
src/backend/parser/gram.y ...

Talk about cheap to implement, review, and maintain!

Also, this would be notionally very simple.

Any optimizations to CTE query/DML execution would be generic and
applicable to MERGE and other things besides. If mapping MERGE to
CTE-using DMLs motivates such optimizations, all the better.

> [...]. It may actually be possible to map from CTEs to a MERGE
> statement, but I don't think that that's a good approach to implementing
> MERGE.

Surely not every DML with CTEs can map to MERGE. Maybe I misunderstood
your comment?

> Most of the implementation time will probably be spent doing things like
> making sure MERGE behaves appropriately with triggers, RLS, updatable
> views, and so on. That will take quite a while, but isn't particularly
> technically challenging IMV.

Note that mapping to a DML with CTEs as above gets triggers, RLS, and
updateable views right from the get-go, because DMLs with CTEs, and DMLs
as CTEs, surely do as well.

Nico
--

In response to

Re: MERGE SQL Statement for PG11 at 2017-11-02 19:51:45 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Simon Riggs	2017-11-02 20:36:00	Re: MERGE SQL Statement for PG11
Previous Message	Peter Eisentraut	2017-11-02 20:20:19	Re: [PATCH] Add ALWAYS DEFERRED option for constraints