SQL MERGE is quite distinct from UPSERT

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: SQL MERGE is quite distinct from UPSERT
Date: 2014-07-20 04:55:19
Message-ID: CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Someone asked me privately why we weren't immediately pursuing SQL
MERGE, as opposed to yet another non-standard variant of UPSERT
(UPSERT is loosely defined here as an insert-or-update DML command
that goes to update based on would-be unique constraint violations,
and does one or the other *atomically*). I believe at least one other
system (Teradata) that now has both formerly only had the latter. I
think it makes sense to have both. While I believe that it's already
generally accepted that SQL MERGE is distinct from UPSERT in fairly
fundamental ways, in that it solves a somewhat distinct set of
problems, I would like to put my thoughts on the matter on record.
This is mostly for the sake of the archives.

The first reason why the two are distinct is that MERGE, as
implemented within other major systems and as described by the SQL
standard does not require an atomic insert-or-update for even the
simplest possible case of OLTP style insert-or-update. As I went into
during my talk at pgCon, even these simple cases may throw errors on
other systems' MERGE implementations. This leeway to fail may be
valuable for some use cases though, like the bulk loading use case.
Simon Riggs has characterized MERGE as something that exists for the
benefit of those two fairly distinct use cases [1], and I agree with
him. I suspect that the predictability of MERGE's behavior suffers due
to trying to support both at once, and I think that MERGE should be
specifically promoted as a bulk loading command if we ever get around
to adding it to PostgreSQL.

The second reason they differ is that SQL MERGE as implemented within
other systems doesn't require that a unique index be defined on the
column or columns of the MERGE statement's outer join. I am not
enthusiastic about the idea of falling back to table-level locking
when an appropriate index isn't available. That could be a foot-gun
for the large majority of users that just want to solve their basic
insert-or-update problem. I did think to mention this when asked about
my work on UPSERT, but the next question was: "So why not just throw
an error when it's not possible to use an available unique index"?
This is a fair point, but doesn't address all the problems with trying
to make MERGE perform upserts that meet my requirements for UPSERT
[2](principally the requirement of atomicity in the sense of always
getting an insert or update).

These two points, particularly the first are likely to weigh upon the
implementation of both, which is why we cannot very well just insist
upon having appropriate unique indexes defined. It buys my UPSERT
implementation plenty to have the upsert driven by an insert into a
unique index. With UPSERT, having an insert occur is always an outcome
that the DML statement's author will consider acceptable. SQL MERGE
statements can be written without any WHEN NOT MATCHED THEN INSERT;
SQL MERGE statement authors are not necessarily going to expect an
insert at all.

As I outlined in my pgCon talk, there appears to be a fundamental
trade-off involved when implementing UPSERT. My implementation is
prepared for the possibility that upon finding a duplicate, and upon
subsequently attempting to lock a row, the row may be found to be
deleted. When that happens, we try again without having made any
useful progress in the first iteration. The ideal of a strict queue of
upserters, with totally fair row locking style lock arbitration
appears to be basically impossible, at least provided you're unwilling
to give up insert-or-update atomicity (that is, to get this you have
to be willing to give up with an error even at READ COMMITTED, which
is a cop out, or buy into using something like table level locks,
which is also a cop out). Theoretical lock starvation hazards exist in
my implementation because I don't do this, but all indications are
that this is fine in the real world. In this regard my implementation
is similar to the current looping upsert subtransaction pattern we've
been recommending for years [3]. We don't hold on to "value locks" on
values within indexes (except to finish staggered index-locks-first
inserting - but not to update/row lock).

This is all fine in the real world because *somebody* has to make
progress when there is highly contended upserting. Some session has to
insert-or-update successfully - otherwise you cannot have a conflict
to begin with. Livelocking is therefore not possible, and you need a
*sustained* perfect storm of conflicts starving out one session in
order to see lock starvation at all. SQL MERGE has an outer join on a
table of proposed values (as opposed to an inner join against already
rejected values). MERGE allows you to do something other than insert
WHEN NOT MATCHED (i.e. to do nothing), and I think that being able to
always insert at that juncture gives us a way to usefully terminate
the loop where there might not be another way.

Suppose you had a similar "keep retrying" implementation (similar to
what I've proposed for UPSERT), but built this for SQL MERGE. Ignoring
the fact that no other SQL MERGE implementation has
insert-or-update-or-delete atomicity, you insist upon this for your
implementation. Consider how this would behave with a MERGE statement
that had only a WHEN MATCHED THEN UPDATE clause and/or a WHEN MATCHED
THEN DELETE clause. You need to do nothing for the WHEN NOT MATCHED
case, but how exactly do you decide when and how to do nothing when
it's based on some index value *not* being there? This isn't the same
as INSERT IGNORE - it's the exact opposite. We're doing nothing
because there was no conflict found, *not* because there was some
conflict conclusively found. Furthermore, not only are we doing
nothing, we're specifically not doing something else (updating). We're
not doing updating in a way predicated on the physical absence of a
value from an index at one instant in time. But what on earth does
that absence tell us? In what sense is the value not there? Is it in
the heap, but not yet in any index? Is it visible to our estate
snapshot? My sense is that any attempt to resolve these questions
would be extremely ticklish, and would likely introduce livelock
hazards or deadlock hazards. You end up with head melting
Heisenberg-esque questions when you try to impose this requirement on
SQL MERGE. AFAICT that's why no one else has succeeded in doing so. I
think you're just supposed to live with those problems ("Nothing
visible to my MVCC snapshot? Good enough, do WHEN NOT MATCHED handling
then"). A lot of the time these problems don't come up.

To state it another way, my implementation relies on something
*conclusively* being there (conclusively committed and visible to new
snapshots) - our new index tuple, or someone else's existing one. If
it's the latter case we're finally done when we conclusively lock the
tuple without a conflict (that is, without it being deleted in the
interim). We rely on things happening conclusively (things that have
locking implications, like inserting an index tuple or locking the
row). We cannot conclusively conclude that something is not there
without extreme measures that are very questionable.

At a high level SQL MERGE is quite distinct from UPSERT, in that it is
a utility command that performs inserts, updates and deletes while
avoiding race conditions (e.g. unique constraint violations) on a more
or less best effort basis. MERGE is conceptually messy. In contrast
UPSERT is actually atomic, and having its behavior be relatively easy
to reason about ought to be the top priority. There is a *really* big
demand for UPSERT from users, not MERGE, although MERGE is certainly
useful too.

[1] http://www.postgresql.org/message-id/1132000496.3388.31.camel@holly
[2] http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf,
Slide 8, "Goals for UPSERT in Postgres"
[3] http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE
Peter Geoghegan


Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2014-07-20 05:44:38 Re: [bug fix] pg_ctl always uses the same event source
Previous Message Peter Geoghegan 2014-07-20 02:04:32 Re: Making joins involving ctid work for the benefit of UPSERT