Re: MERGE SQL Statement for PG11

From: Nico Williams <nico(at)cryptonector(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MERGE SQL Statement for PG11
Date: 2017-11-07 23:29:22
Message-ID: 20171107232920.GZ4496@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico(at)cryptonector(dot)com> wrote:
> >A MERGE mapped to a DML like this:

I needed to spend more time reading MERGE docs from other RDBMSes.

The best MERGE so far is MS SQL Server's, which looks like:

MERGE INTO <target> <target_alias>
USING <source> <source_alias>
ON (<join condition>)

-- optional:
WHEN MATCHED THEN UPDATE SET ...

-- optional:
WHEN NOT MATCHED [ BY TARGET ] THEN INSERT ...

-- optional:
WHEN NOT MATCHED BY SOURCE THEN DELETE

-- optional:
OUTPUT ...

;

(The other MERGEs are harder to use because they lack a WHEN NOT MATCHED
BY SOURCE THEN DELETE, instead having a DELETE clause on the UPDATE,
which is then difficult to use.)

This is *trivial* to map to a CTE, and, in fact, I and my colleagues
have resorted to hand-coded CTEs like this precisely because PG lacks
MERGE (though we ourselves didn't know about MERGE -- it's new to us).

If <source> is a query, then we start with a CTE for that, else if it's
a view or table, then we don't setup a CTE for it. Each of the UPDATE,
INSERT, and/or DELETE can be it's own CTE. If there's an OUTPUT clause,
that can be a final SELECT that queries from the CTEs that ran the DMLs
with RETURNING. If there's no OUTPUT then none of the DMLs need to have
RETURNING, and one of them will be the main statement, rather than a
CTE.

The pattern is:

WITH
-- IFF <source> is a query:
<source_alias> AS (<source>),

-- IFF there's a WHEN MATCHED THEN UPDATE
updates AS (
UPDATE <target> AS <target_alias> SET ...
FROM <source>
WHERE <join_condition>
-- IFF there's an OUTPUT clause, then:
RETURNING 'update' as "@action", ...
),

inserts AS (
INSERT INTO <target> (<column_list>)
SELECT ...
FROM <source>
LEFT JOIN <target> ON <join_condition>
WHERE <target> IS NOT DISTINCT FROM NULL
-- IFF there's a CONCURRENTLY clause:
ON CONFLICT DO NOTHING
-- IFF there's an OUTPUT clause, then:
RETURNING 'insert' as "@action", ...
),

deletes AS (
DELETE FROM <target>
WHERE NOT EXISTS (SELECT * FROM <source> WHERE <join_condition>)
-- IFF there's an OUTPUT clause, then:
RETURNING 'delete' as "@action", ...
),

-- IFF there's an OUTPUT clause
SELECT * FROM updates
UNION
SELECT * FROM inserts
UNION
SELECT * FROM deletes;

If there's not an output clause then one of the DMLs has to be the main
statement:

WITH ...
DELETE ...; -- or UPDATE, or INSERT

Note that if the source is a view or table and there's no OUTPUT clause,
then it's one DML with up to (but not referring to) two CTEs, and in all
cases the CTEs do not refer to each other. This means that the executor
can parallelize all of the DMLs.

If the source is a query, then that could be made a temp view to avoid
having to run the query first. The CTE executor needs to learn to
sometimes do this anyways, so this is good.

The <deletes> CTE can be equivalently written without a NOT EXISTS:

to_be_deleted AS (
SELECT <target>
FROM <target>
LEFT JOIN <source> ON (<join_condition>)
WHERE <source> IS NOT DISTINCT FROM NULL
),
deletes AS (
DELETE FROM <target>
USING to_be_deleted tbd
WHERE <target> = <tbd>
)

if that were to run faster (probably not, since PG today would first run
the to_be_deleted CTE, then the deletes CTE). I mention only because
it's nice to see the symmetry of LEFT JOINs for the two WHEN NOT MATCHED
cases.

(Here <source> is the alias for it if one was given.)

***

This mapping triggers triggers as one would expect (at least FOR EACH
ROW; I expect the DMLs in CTEs should also trigger FOR EACH STATEMENT
triggers, and if they don't I consider that a bug).

> This is a bad idea. An implementation like this is not at all
> maintainable.

I beg to differ. First of all, not having to add an executor for MERGE
is a win: much, much less code to maintain. The code to map MERGE to
CTEs can easily be contained entirely in src/backend/parser/gram.y,
which is a maintainability win: any changes to how CTEs are compiled
will fail to compile if they break the MERGE mapping to CTEs.

> >can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.
>
> That's not handling concurrency -- it's silently ignoring an error. Who
> is to say that the conflict that IGNORE ignored is associated with a row
> visible to the MVCC snapshot of the statement? IOW, why should the DELETE
> affect any row?

That was me misunderstanding MERGE. The DELETE is independent of the
INSERT -- if an INSERT does nothing because of an ON CONFLICT DO
NOTHING clause, then that won't cause that row to be deleted -- the
inserts and deletes CTEs are independent in the latest mapping (see
above).

I believe adding ON CONFLICT DO NOTHING to the INSERT in this mapping is
all that's needed to support concurrency.

> There are probably a great many reasons why you need a ModifyTable
> executor node that keeps around state, and explicitly indicates that a
> MERGE is a MERGE. For example, we'll probably want statement level
> triggers to execute in a fixed order, regardless of the MERGE, RLS will
> probably require explicitly knowledge of MERGE semantics, and so on.

Let's take those examples one at a time:

- Is there a reason to believe that MERGE could not parallelize the
DMLs it implies?

If they can be parallelized, then we should not define the order in
which the corresponding triggers fire.

Surely we want to leave that possibility (parallelization) open,
rather than exclude it.

The user should not depend on the order in which the FOR EACH
STATEMENT and FOR EACH ROW triggers will fire. They can always check
at the end of the transaction with DEFERRED triggers (see also my
patch for ALWAYS DEFERRED constraints and triggers).

AFTER <op> FOR EACH STATEMENT triggers will only run after all the
corresponding DMLs in the mapping have completed, but their relative
orders should still not be defined.

- I don't see how RLS isn't entirely orthogonal.

RLS would (does) apply as normal to all of the DMLs in the mapping.

If that was not the case, then there'd be a serious bug in PG right
now! Using a CTE must *not* disable RLS.

FOR UPDATE RLS policies are broken, however, since they don't get to
see the OLD and NEW values. But that's orthogonal here.

> FWIW, your example doesn't actually have a source (just a target), so it
> isn't actually like MERGE.

That was my mistake -- as I say above, I had to spend more time with the
various RDBMSes' MERGE docs.

Nico
--

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nico Williams 2017-11-07 23:30:03 Re: [PATCH] Add ALWAYS DEFERRED option for constraints
Previous Message Michael Paquier 2017-11-07 23:21:49 Re: Exclude pg_internal.init from base backup