Re: MERGE Specification

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Decibel! <decibel(at)decibel(dot)org>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>, "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
Subject: Re: MERGE Specification
Date: 2008-04-28 09:49:03
Message-ID: 1209376143.4391.43.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2008-04-28 at 11:57 +0300, Marko Kreen wrote:
> On 4/25/08, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> wrote:
> > On Thursday 24 April 2008 23:40, Tom Lane wrote:
> > > Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> > > > Perhaps a better option would be to implement Merge per spec, and then
> > > > implement a "replace into" command for the oltp scenario. This way you
> > > > keep the spec behavior for the spec syntax, and have a clearly non-spec
> > > > command for non-spec behavior.
> > >
> > > In that case, it's a fair question to ask just who will use the "spec"
> > > syntax. As far as I can tell from years of watching the mailing lists,
> > > there is plenty of demand for a concurrent-safe insert-or-update
> > > behavior, and *exactly zero* demand for the other. I challenge you to
> > > find even one request for the "spec" behavior in the mailing list
> > > archives. (Simon doesn't count.)
> > >
> >
> >
> > AIUI the current implementation is designed to avoid race conditions partially
> > at the cost of being insert friendly and somewhat update unfriendly. My guess
> > is that most of the people wanting this for OLTP use want an update friendly
> > implementation (that's certainly been the majority of cases I've needed
> > myself, and that I have seen others use).
>
> This seems to hint that there should be 2 variants of merge/upsert
> - insert-friendly and update-friendly... It seems unlikely one implementation
> can be both. And especially bad would be implementation that is neither.

Not sure what an option that was "neither" would look like ...

I would summarise the two MERGE behaviour proposals as

1. Correctly protects against concurrent inserts. Uses one
sub-transaction per row and leaves 2 dead rows per update. Requires us
to perform tasks in different order than required by SQL spec, but the
end result seems identical to me (now).
Has been noted as suitable for OLTP, and poor for bulk data maintenance.
Has been described as "insert-friendly" and "non-spec".

2. Does not protect against concurrent inserts. Leaves 1 dead row per
update. Much more efficient for updates, not sure about any efficiency
gain for inserts.
Has been noted as being unsuitable for OLTP, though likely to offer more
acceptable performance for bulk operations.
Has been described as "update-friendly".

By consensus, I'm doing (1).

It looks likely that doing (2) should be fairly small change and so can
be offered as an option. For example, we can have an additional
action-order clause with two options (the first of which is default)
[INSERT BEFORE UPDATE ACTION ORDER | DEFAULT ACTION ORDER]
So the default is to force inserts to occur before updates, as required
by (1). The other option "DEFAULT ACTION ORDER" tests the WHEN clauses
in the order specified in the statement, allowing the user to choose
whether they want to test for updates or inserts first.

Overall, the difference between these behaviours is small in comparison
with making MERGE work in the first place...

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message dv @ nabble 2008-04-28 09:49:57 SRF in SFRM_ValuePerCall mode
Previous Message Gregory Stark 2008-04-28 08:58:12 Re: Proposed patch - psql wraps at window width