Re: MERGE SQL Statement for PG11

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MERGE SQL Statement for PG11
Date: 2017-10-28 10:10:50
Message-ID: CANP8+j+FPEqpd8O3scwOKUgRvtWB5r5RhcCuJ8nn6VU_wc2Nng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28 October 2017 at 00:31, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Fri, Oct 27, 2017 at 9:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Oct 27, 2017 at 4:41 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> I didn't say it but my intention was to just throw an ERROR if no
>>> single unique index can be identified.
>>>
>>> It could be possible to still run MERGE in that situaton but we would
>>> need to take a full table lock at ShareRowExclusive. It's quite likely
>>> that such statements would throw duplicate update errors, so I
>>> wouldn't be aiming to do anything with that for PG11.
>>
>> Like Peter, I think taking such a strong lock for a DML statement
>> doesn't sound like a very desirable way forward. It means, for
>> example, that you can only have one MERGE in progress on a table at
>> the same time, which is quite limiting. It could easily be the case
>> that you have multiple MERGE statements running at once but they touch
>> disjoint groups of rows and therefore everything works. I think the
>> code should be able to cope with concurrent changes, if nothing else
>> by throwing an ERROR, and then if the user wants to ensure that
>> doesn't happen by taking ShareRowExclusiveLock they can do that via an
>> explicit LOCK TABLE statement -- or else they can prevent concurrency
>> by any other means they see fit.
>
> +1, I would suspect users to run this query in parallel of the same
> table for multiple data sets.
>
> Peter has taken some time to explain me a bit his arguments today, and
> I agree that it does not sound much appealing to have constraint
> limitations for MERGE. Particularly using the existing ON CONFLICT
> structure gets the feeling of having twice a grammar for what's
> basically the same feature, with pretty much the same restrictions.
>
> By the way, this page sums up nicely the situation about many
> implementations of UPSERT taken for all systems:
> https://en.wikipedia.org/wiki/Merge_(SQL)

That Wikipedia article is badly out of date and regrettably does NOT
sum up the current situation nicely any more since MERGE has changed
in definition in SQL:2011 since its introduction in SQL:2003.

I'm proposing a MERGE statement for PG11 that
i) takes a RowExclusiveLock on rows, so can be run concurrently
ii) uses the ON CONFLICT infrastructure to do that
and so requires a unique constraint.

The above is useful behaviour that will be of great benefit to
PostgreSQL users. There are no anomalies remaining.

SQL:2011 specifically states "The extent to which an
SQL-implementation may disallow independent changes that are not
significant is implementation-defined”, so in my reading the above
behaviour would make us fully spec compliant. Thank you to Peter for
providing the infrastructure on which this is now possible for PG11.

Serge puts this very nicely by identifying two different use cases for MERGE.

Now, I accept that you might also want a MERGE statement that
continues to work even if there is no unique constraint, but it would
need to have different properties to the above. I do not in any way
argue against adding that. I also agree that adding RETURNING at a
later stage would be fine as well. I am proposing that those and any
other additional properties people come up with can be added in later
releases once we have the main functionality in core in PG11.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Travers 2017-10-28 11:22:20 WIP: Restricting pg_rewind to data/wal dirs
Previous Message Robert Haas 2017-10-28 10:05:26 Re: Fix typo in blvacuum.c