Re: UPSERT wiki page, and SQL MERGE syntax

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: UPSERT wiki page, and SQL MERGE syntax
Date: 2014-10-10 18:05:27
Message-ID: CA+TgmoZN=2AJKi1n4Jz5BkmYi8r_CPUDW+DtoppmTeLVmsOoqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 5:01 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> On Mon, Oct 6, 2014 at 8:35 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I think the problem is that it's not possible to respect the "usual
>>> guarantees" even at READ COMMITTED level when performing an INSERT OR
>>> UPDATE operation (however spelled). You may find that there's a tuple
>>> with the same PK which is committed but not visible to the snapshot
>>> you took at the beginning of the statement.
>>
>> Can you please comment on this, Kevin? It would be nice to converge on
>> an agreement on syntax here
>
> Robert said "however spelled" -- which I take to mean that he at
> least sees that the MERGE-like UPSERT syntax can be turned into the
> desired semantics. I have no idea why anyone would think otherwise.
>
> Although the last go-around does suggest that there is at least one
> point of difference on the semantics. You seem to want to fire the
> BEFORE INSERT triggers before determining whether this will be an
> INSERT or an UPDATE.

OK, I have a comment on this.

Anything we do about triggers will by definition be novel. Right now,
we have INSERT, UPDATE, and DELETE. If we add a new operation,
whether it's called UPSERT or MERGE or FROB, or if we add a flag to
insert that makes it possibly do something other than inserting (e.g.
INSERT OR UPDATE), or if we add a flag to update that makes it do
something other than updating (e.g. UPDATE or INSERT), then some
people's triggers are going to get broken by that change no matter how
we do it. When the new operation is invoked, we can fire the insert
triggers, the update triggers, some new kind of trigger, or no trigger
at all - and any decision we make there will not in all cases be
backward-compatible. We can try to minimize the damage (and we
probably should) but we can't make it go to zero.

> INSERT INTO targettable(key, quantity, inserted_at)
> VALUES(123, quantity, now())
> ON CONFLICT WITHIN targettable_pkey
> UPDATE SET quantity = quantity + CONFLICTING(quantity), updated_at = now();

I actually like this syntax reasonably well in some ways, but I don't
like that we're mentioning the index name, and the CONFLICTING()
notation is decidedly odd. Maybe something like this:

INSERT INTO targettable(key, quantity, inserted_at)
VALUES(123, quantity, now())
ON DUPLICATE (key)
UPDATE SET quantity = quantity + OLD.quantity, updated_at = now();

(Perhaps OLD should reference the tuple already in the table, and NEW
the value from the VALUES clause. That would be snazzy indeed.)

Also, how about making the SET clause optional, with the semantics
that we just update all of the fields for which a value is explicitly
specified:

INSERT INTO overwrite_with_abandon (key, value)
VALUES (42, 'meaning of life')
ON DUPLICATE (key) UPDATE;

While the ability to specify a SET clause there explicitly is useful,
I bet a lot of key-value store users would love the heck out of a
syntax that let them omit it when they want to overwrite.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2014-10-10 18:08:38 Re: bad estimation together with large work_mem generates terrible slow hash joins
Previous Message Kevin Grittner 2014-10-10 18:03:13 Re: Column Redaction