Skip site navigation (1) Skip section navigation (2)

Re: ask for review of MERGE

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Boxuan Zhai <bxzhai2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ask for review of MERGE
Date: 2010-10-21 18:36:17
Message-ID: 4CC08821.3020102@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Robert Haas wrote:
> I think the right way to write UPSERT is something
> along the lines of:
>
> MERGE INTO Stock t USING (VALUES (10, 1)) s(item_id, balance) ON
> s.item_id = t.item_id ...
>   

That led in the right direction, after a bit more fiddling I was finally 
able to get something that does what I wanted:  a single table UPSERT 
implemented with this MERGE implementation.  Here's a log of a test 
session, suitable for eventual inclusion in the regression tests:

CREATE TABLE Stock(item_id int UNIQUE, balance int);
INSERT INTO Stock VALUES (10, 2200);
INSERT INTO Stock VALUES (20, 1900);
SELECT * FROM Stock ORDER BY item_id;

 item_id | balance
---------+---------
      10 |    2200
      20 |    1900

MERGE INTO Stock t
 USING (VALUES(10,100)) AS s(item_id,balance)
 ON s.item_id=t.item_id
 WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
 WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
 ;

MERGE 1

SELECT * FROM Stock ORDER BY item_id;
 item_id | balance
---------+---------
      10 |    2300
      20 |    1900

MERGE INTO Stock t
 USING (VALUES(30,2000)) AS s(item_id,balance)
 ON s.item_id=t.item_id
 WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
 WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
 ;

MERGE 1
SELECT * FROM Stock ORDER BY item_id;
 item_id | balance
---------+---------
      10 |    2300
      20 |    1900
      30 |    2000

I'm still a little uncertain as to whether any of my other examples 
should have worked under the spec but just didn't work here, but I'll 
worry about that later.

Here's what the query plan looks like on a MATCH:

 Merge  (cost=0.00..8.29 rows=1 width=22) (actual time=0.166..0.166 
rows=0 loops=1)
   Action 1: Update When Matched
   Action 2: Insert When Not Mactched
   MainPlan:
   ->  Nested Loop Left Join  (cost=0.00..8.29 rows=1 width=22) (actual 
time=0.050..0.061 rows=1 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=8) 
(actual time=0.009..0.010 rows=1 loops=1)
         ->  Index Scan using stock_item_id_key on stock t  
(cost=0.00..8.27 rows=1 width=14) (actual time=0.026..0.030 rows=1 loops=1)
               Index Cond: ("*VALUES*".column1 = item_id)
 Total runtime: 0.370 ms


And here's a miss:

 Merge  (cost=0.00..8.29 rows=1 width=22) (actual time=0.145..0.145 
rows=0 loops=1)
   Action 1: Update When Matched
   Action 2: Insert When Not Mactched
   MainPlan:
   ->  Nested Loop Left Join  (cost=0.00..8.29 rows=1 width=22) (actual 
time=0.028..0.033 rows=1 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=8) 
(actual time=0.004..0.005 rows=1 loops=1)
         ->  Index Scan using stock_item_id_key on stock t  
(cost=0.00..8.27 rows=1 width=14) (actual time=0.015..0.015 rows=0 loops=1)
               Index Cond: ("*VALUES*".column1 = item_id)
 Total runtime: 0.255 ms

Next steps here:
1) Performance/concurrency tests against trigger-based UPSERT approach.
2) Finish bit rot cleanup against HEAD.
3) Work out more complicated test cases to try and fine more unexpected 
behavior edge cases and general bugs.

-- 
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us



In response to

Responses

pgsql-hackers by date

Next:From: Kevin GrittnerDate: 2010-10-21 18:37:50
Subject: Re: Serializable snapshot isolation patch
Previous:From: Alvaro HerreraDate: 2010-10-21 18:32:53
Subject: Re: Bug in plpython's Python Generators

Privacy Policy | About PostgreSQL
Copyright © 1996-2013 The PostgreSQL Global Development Group