Re: ask for review of MERGE

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Greg Smith <greg(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, Boxuan Zhai <bxzhai2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: ask for review of MERGE
Date: 2010-10-25 20:10:48
Message-ID: AANLkTi=a1+AWn0wpmWXNLHWt1AsmU+q3pajPuSdN4SfF@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 25, 2010 at 12:40 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Now, as Greg says, that might be what some people want, but it's
> certainly monumentally unserializable.

To be clear, when I said it's what people want, I meant that in the
common cases it does exactly what people want, as opposed to getting
closer to what people want in general but not quite hitting the mark
in the common cases.

Just as an example, I think it's important that in the simplest case,
upsert of a single record, it be 100% guaranteed to do the naive
upsert. If two users are merging a single key at the same time, one of
them had better insert and the other had better update, or else users
are going to be monumentally surprised.
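
As a rough illustration of that expectation (a toy model only, not how
the patch or the executor works; ToyTable, upsert, and run_two_sessions
are hypothetical names, and a plain lock stands in for whatever
mechanism arbitrates the unique key), two sessions upserting the same
key should end up as one insert and one update:

```python
import threading


class ToyTable:
    """Hypothetical in-memory stand-in for a table with a unique key."""

    def __init__(self):
        self._rows = {}
        # Assumption: a single lock plays the role of the uniqueness check.
        self._lock = threading.Lock()

    def upsert(self, key, value):
        """Insert if the key is absent, otherwise update; report which one ran."""
        with self._lock:
            if key in self._rows:
                self._rows[key] = value
                return "update"
            self._rows[key] = value
            return "insert"


def run_two_sessions():
    """Run two concurrent upserts of the same key and collect the outcomes."""
    table = ToyTable()
    outcomes = []

    def session(value):
        outcomes.append(table.upsert("k", value))

    threads = [threading.Thread(target=session, args=(v,)) for v in (1, 2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(outcomes)
```

Whichever session wins the race, the sorted outcomes should be one
"insert" and one "update", never two inserts or two updates.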

I guess I hadn't considered all the cases, and I agree it's important
that our behaviour make some kind of sense and be consistent with how
we handle updates of existing in-doubt tuples. I wasn't trying to
introduce a whole new mode of operation, just to work by analogy from
the way update works. It's clear that even with our existing semantics
there are strange corner cases once you get to multiple updates
happening in a single transaction. But we get the simple cases right,
and even in the more complex cases, while it's not truly serializable,
we should be able to come up with some basic smell tests that we pass.

My understanding is that currently we generally treat DML in one of
two ways depending on whether it's returning data to the user or
updating data in the table (including SELECT FOR SHARE). If it's
returning data to the user we use a snapshot to give the user a
consistent view of the database. If it's altering data in the database
we use the snapshot to get a consistent set of records and then apply
the updates to the most recent version.
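
A minimal sketch of that distinction, under heavy assumptions: plain
dicts stand in for the snapshot and for the most recent committed
versions, the function names are made up for illustration, and the
sketch ignores locking as well as any re-check of the WHERE clause
against the newer version.

```python
def read_with_snapshot(snapshot, predicate):
    """A read sees only the snapshot, even if newer versions exist."""
    return {key: val for key, val in snapshot.items() if predicate(val)}


def write_with_snapshot(snapshot, latest, predicate, change):
    """A write picks its target rows from the snapshot, then applies
    the change to the most recent version of each of those rows."""
    targets = [key for key, val in snapshot.items() if predicate(val)]
    for key in targets:
        if key in latest:  # the row may have been deleted in the meantime
            latest[key] = change(latest[key])
    return targets


# Row "a" was changed from 10 to 15 by a concurrent commit after the
# snapshot was taken.
snapshot = {"a": 10, "b": 20}
latest = {"a": 15, "b": 20}

seen = read_with_snapshot(snapshot, lambda v: v >= 10)
touched = write_with_snapshot(snapshot, latest, lambda v: v >= 10, lambda v: v + 1)
```

Here the read still reports the old value of "a", while the write
increments the newer value, which is the split between the two kinds of
DML described above.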

The anomaly you showed with update and the problem with MERGE both
arise because the operation was simultaneously doing a "read" -- the
WHERE clause and the uniqueness check in the MERGE -- and a write. This
is already the kind of case where we do weird things -- what kind of
behaviour would be consistent with our existing, somewhat weird,
behaviour?

--
greg
