Re: ask for review of MERGE

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, Boxuan Zhai <bxzhai2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ask for review of MERGE
Date: 2010-10-24 16:15:54
Message-ID: 4CC45BBA.9080400@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas wrote:
> I am also wondering exactly what the semantics are supposed to be
> under concurrency. We can't assume that it's OK to treat WHEN NOT
> MATCHED as WHEN MATCHED if a unique-key violation is encountered. The
> join condition could be something else entirely. I guess we could
> scan the target table with a dirty snapshot and block on any in-doubt
> tuples, similar to what EXCLUDE constraints do internally, but
> throwing MVCC out the window doesn't seem right either.
>

You've answered Tom's question of why not to just INSERT and trap the
error here. Presuming a unique key violation is what will kick back in
this situation and to just follow the other side doesn't cover all the
ways this can get used.

I'm thinking of this problem now as being like the one that happens when
a concurrent UPDATE happens. To quote from the docs here:

"a target row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case,
the would-be updater will wait for the first updating transaction to
commit or roll back (if it is still in progress). If the first updater
rolls back, then its effects are negated and the second updater can
proceed with updating the originally found row. If the first updater
commits, the second updater will ignore the row if the first updater
deleted it, otherwise it will attempt to apply its operation to the
updated version of the row. The search condition of the command (the
WHERE clause) is re-evaluated to see if the updated version of the row
still matches the search condition. If so, the second updater proceeds
with its operation using the updated version of the row."

What I think we can do with adding a key reservation is apply the same
sort of logic to INSERTs too, given a way to lock an index entry before
the row itself is complete. If MERGE hits a WHERE condition that finds
a key lock entry in the index, it has to sit and wait for that other
command to finish. There isn't any other sensible behavior I can see in
that situation, just like there isn't one for UPDATE right now.

If the transaction that was locking the index entry commits, it has to
start over with re-evaluating the WHEN MATCHED part (the equivilent of
the WHERE in the UPDATE case). But it shouldn't need to do a new JOIN
again, because that condition was proven to be met already by the lock
squatting on that index key, without even having the rest of the row
there yet to check its data. If the other entry commits, it would then
follow the WHEN MATCHED path and in the UPSERT case do an UPDATE. If
the transaction who had the key reservervation rolls back, the
re-evaluation of the MERGE WHEN MATCHED will fail, at which point it
follows the WHERE FOUND INSERT path. As it will now be the locker of
the key reservation it had been waiting on earlier at that point, it
will be free to do the INSERT with no concerns about race conditions or
retries. In the SERIALIZABLE case, again just try to follow the same
slightly different rules that exist for concurrent UPDATE commands.

There may be a tricky point here that we will need to warn people that
UPSERT implementations need to make sure they only reference the columns
of the primary key in the MERGE conditions for this to work as expected,
because otherwise they might get back a unique key violation error
anyway. I don't know how other databases deal with that particular
problem. I have considered following the somewhat distasteful path of
installing SQL Server or Oracle here just to see how they handle some of
the pathological test cases I'm now thinking about for this command.

This particular weak spot in MVCC is already there for UPDATE, and this
approach seems to me to be the most logical way to extend the same basic
implementation to also eliminate some concurrent MERGE race conditions,
including the most common one the UPSERT case is stuck with. But I have
no intention of halting work on the basic MERGE to wait for this problem
to be solved even if I'm completely wrong about what I've outlined
above. That style of thinking--don't even start until every a perfect
solution for every possible problem is known--is something I consider a
problem in this community's development model, one that has contributed
to why no work has been done on this feature until now. This one splits
nicely into "get the implemenation working" and "improve the
concurrency" parts, and I haven't heard a good reason yet why those need
to coupled together.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-10-24 16:20:22 Re: WIP: extensible enums
Previous Message Sushant Sinha 2010-10-24 15:26:28 Re: planner row-estimates for tsvector seems horribly wrong