Re: ask for review of MERGE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, Boxuan Zhai <bxzhai2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ask for review of MERGE
Date: 2010-10-25 18:09:54
Message-ID: AANLkTinbxfdaSYQan1fhT0UOKGD20z4xN6XGrc7sOAYu@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 25, 2010 at 1:42 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Sun, Oct 24, 2010 at 10:43 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> But let's back up and talk about MVCC for a minute.  Suppose we have
>> three source tuples, (1), (2), and (3); and the target table contains
>> tuples (1) and (2), of which only (1) is visible to our MVCC snapshot;
>> suppose also an equijoin.  Clearly, source tuple (1) should fire the
>> MATCHED rule and source tuple (3) should fire the NOT MATCHED rule,
>> but what in the world should source tuple (2) do?  AFAICS, the only
>> sensible behavior is to throw a serialization error, because no matter
>> what you do the results won't be equivalent to a serial execution of
>> the transaction that committed target tuple (2) and the transaction
>> that contains the MERGE.
>
> So the behaviour we get with UPDATE in this situation is that we
> update (2) so I would expect this to execute the MATCHED rule.

Not exactly. Consider this example:

rhaas=# create table concurrent (x integer primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"concurrent_pkey" for table "concurrent"
CREATE TABLE
rhaas=# insert into concurrent values (1);
INSERT 0 1
rhaas=# begin;
BEGIN
rhaas=# insert into concurrent values (2);
INSERT 0 1

<switch to a different window>

rhaas=# update concurrent set x=x where x=2;
UPDATE 0
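The UPDATE matches zero rows because, under READ COMMITTED, the statement's snapshot only sees tuples whose inserting transaction had already committed when the snapshot was taken; the in-progress insert of (2) is invisible. A toy sketch of that visibility rule (the names and numbers here are illustrative, not PostgreSQL internals):

```python
# Toy model of MVCC snapshot visibility: a tuple is visible only if the
# transaction that inserted it (its xmin) had committed before the
# snapshot was taken. Illustrative only; not PostgreSQL's actual code.

def visible(tuple_xmin, committed_xids, snapshot_xmax):
    # The inserting transaction must have committed, and must not be
    # "in the future" relative to the snapshot.
    return tuple_xmin in committed_xids and tuple_xmin < snapshot_xmax

# Target table: (1) inserted by xid 100 (committed), (2) inserted by
# xid 200 (still in progress in another session).
target = {1: 100, 2: 200}
committed = {100}
snapshot_xmax = 150  # snapshot taken before xid 200 began

# "update concurrent set x=x where x=2" scans for visible rows with x=2:
matches = [x for x, xmin in target.items()
           if x == 2 and visible(xmin, committed, snapshot_xmax)]
print(f"UPDATE {len(matches)}")  # the uncommitted (2) is invisible
```

In the three-source-tuple scenario upthread, this is exactly why tuple (2) is the awkward case: it joins to a target row that exists but is invisible to the snapshot, so it fires neither rule cleanly.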

> The key distinction is that since we're not returning the data to the
> user, we want to update the most recent version; it's "almost" as if
> we ran "after" all the other transactions. It's not really
> serializable, and I think in serializable mode we throw a
> serialization failure instead, but in most usage patterns it's
> precisely what the user wants.

I think it would be perfectly reasonable to have a transaction
isolation level that does not use a snapshot at all and instead runs
everything relative to SnapshotNow, and people could use it with MERGE
if they were so inclined. That would correspond more or less to the
READ COMMITTED isolation level specified in the standard; what we now
call READ COMMITTED is actually stronger than the standard's READ
COMMITTED but not quite as strong as REPEATABLE READ. That, combined
with an exclusive lock on the table (or, perhaps, some kind of
predicate locking mechanism), would be sufficient to prevent
serialization anomalies.
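The essential difference is when visibility gets decided: a snapshot freezes it at statement (or transaction) start, while SnapshotNow consults the commit state at the moment each tuple is visited. A toy contrast, which also shows why SnapshotNow can see only half of a concurrent transaction's rows (all names here are illustrative, not PostgreSQL's actual data structures):

```python
# Toy contrast between a fixed MVCC snapshot and a SnapshotNow-style
# scan. With SnapshotNow, visibility is evaluated against the commit
# state at the moment each tuple is visited, so a transaction that
# commits mid-scan can have some rows seen and others missed.

def scan_snapshot(rows, committed_at_start):
    # Visibility frozen when the snapshot was taken.
    return [v for v, xmin in rows if xmin in committed_at_start]

def scan_snapshot_now(rows, commit_log):
    # commit_log[i] is the set of committed xids when row i is visited.
    return [v for i, (v, xmin) in enumerate(rows) if xmin in commit_log[i]]

rows = [("a", 7), ("b", 7)]        # both rows inserted by xid 7
print(scan_snapshot(rows, set()))  # [] -- xid 7 uncommitted at start
# xid 7 commits after row "a" is visited but before row "b":
print(scan_snapshot_now(rows, [set(), {7}]))  # ['b'] -- half the data
```

This is the same failure mode as the enum issue mentioned below: the scan picks up part of a concurrent transaction's work and misses the rest.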

However, I don't think that implementing those semantics for just this
one command (or part of it) makes a whole lot of sense. The
EvalPlanQual (EPQ) behavior of our current default isolation level is
already pretty strange, and adding a random wart whereby the target
table (but not the source table) in a MERGE query gets scanned using
SnapshotNow would be one more piece of strangeness atop the
strangeness we already have.
And, as we just saw with the enum stuff, SnapshotNow can lead to some
awfully strange behavior - you could end up processing half of the
data from a concurrent transaction and missing the other half.

> Here "bbb" contained two records when we began with values "1" and "2"
> but the "2" was inserted in a transaction which hadn't committed yet.
> It commited after the update.
>
> postgres=> begin;
> BEGIN
> postgres=> select * from bbb;
>  i
> ---
>  1
> (1 row)
>
> postgres=> update bbb set i = i+1;
> UPDATE 2
> postgres=> commit;
> COMMIT
> postgres=> select * from bbb;
>  i
> ---
>  2
>  3
> (2 rows)

Well, at least on my system, if the transaction inserting (2) hasn't
committed yet, that UPDATE statement will block until it does, because
trying to change i from 1 to 2 causes the update of the unique index
to block, since there's an in-doubt tuple with (2) already. Then it
will continue on as you've shown here, due to EPQ. But if you run the
same statement with i = i + 10 instead of i = i + 1, it doesn't block,
and it only updates the one row that's visible.
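The EPQ recheck can be pictured as follows: the scan uses the statement's snapshot, but when a matching row turns out to have been concurrently updated, the executor chases the update chain to the newest committed version and re-evaluates the WHERE clause against it before applying the change. A minimal sketch of that recheck logic (illustrative only; Greg's example has no WHERE clause, so I've added a hypothetical qual to show both outcomes):

```python
# Toy sketch of the EvalPlanQual (EPQ) recheck under READ COMMITTED.
# Illustrative only; not PostgreSQL's executor code.

def run_update(snapshot_rows, latest_version, qual):
    """snapshot_rows: row values visible to our snapshot.
    latest_version: maps a row to its newest committed version, if a
    concurrent transaction updated it. qual: the WHERE predicate."""
    updated = 0
    for row in snapshot_rows:
        if not qual(row):
            continue
        newest = latest_version.get(row, row)
        # EPQ: re-check the qual against the newest version, then
        # update that version rather than the one our snapshot saw.
        if newest == row or qual(newest):
            updated += 1
    return updated

# Row i=1 is visible; a concurrent transaction has updated it to i=5.
# A qual of "i < 10" still matches after the EPQ recheck:
print(run_update([1], {1: 5}, lambda i: i < 10))  # 1
# But a qual of "i < 3" no longer matches the newest version:
print(run_update([1], {1: 5}, lambda i: i < 3))   # 0
```

The unique-index blocking in my example is a separate mechanism: the update of x from 1 to 2 has to wait on the in-doubt index entry for (2) before EPQ ever comes into play, which is why i = i + 10 proceeds immediately while i = i + 1 blocks.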

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
