Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Date: 2015-03-02 21:21:48
Message-ID: CAM3SWZSc3-ujrTdKp7OMvVxDLPFfqeSh2FYhoh0VV1qZA5sbjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 2, 2015 at 12:15 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Hmm. I used a b-tree to estimate the effect that the locking would have in
> the UPSERT case, for UPSERT into a table with a b-tree index. But you're
> right that for the question of whether this is acceptable for the case of
> regular insert into a table with exclusion constraints, other indexams are
> more interesting. In a very quick test, the overhead with a single GiST
> index seems to be about 5%. IMHO that's still a noticeable slowdown, so
> let's try to find a way to eliminate it.

Honestly, I don't know why you're so optimistic that this can be
fixed, even with that heavyweight lock (for regular inserters +
exclusion constraints).

My experimental branch, which I showed you privately shows big
problems when I simulate upserting with exclusion constraints (so we
insert first, handle exclusion violations using the traditional upsert
subxact pattern that does work with B-Trees). It's much harder to back
out of a heap_update() than it is to back out of a heap_insert(),
since we might well be updated a tuple visible to some other MVCC
snapshot. Whereas we can super delete a tuple knowing that only a
dirty snapshot might have seen it, which bounds the complexity (only
wait sites for B-Trees + exclusion constraints need to worry about
speculative insertion tokens and so on).

My experimental branch works just fine (with a variant jjanes_upsert
with subxact looping), until I need to restart an update after a
"failed" heap_update() that still returned HeapTupleMayBeUpdated
(having super deleted within an ExecUpdate() call). There is no good
way to do that for ExecUpdate() that I can see, because an existing,
visible row is affected (unlike with ExecInsert()). Even if it was
possible, it would be hugely invasive to already very complicated code
paths.

I continue to believe that the best way forward is to incrementally
commit the work by committing ON CONFLICT IGNORE first. That way,
speculative tokens will remain strictly the concern of UPSERTers or
sessions doing INSERT ... ON CONFLICT IGNORE. Unless, you're happy to
have UPDATEs still deadlock, and only fix unprincipled deadlocks for
the INSERT case. I don't know why you want to make the patch
incremental by "fixing" unprincipled deadlocks in the first place,
since you've said that you don't really care about it as a real world
problem, and because it now appears to have significant additional
difficulties that were not anticipated.

Please, let's focus on getting ON CONFLICT IGNORE into a commitable
state - that's the best way of incrementally committing this work.
Fixing unprincipled deadlocks is not a useful way of incrementally
committing this work. I've spent several days producing a prototype
that shows the exact nature of the problem. If you think I'm mistaken,
and that fixing unprincipled deadlocks first is the right thing to do,
please explain why with reference to that prototype. AFAICT, doing
things your way is going to add significant additional complexity for
*no appreciable benefit*. You've already gotten exactly what you were
looking for in every other regard. In particular, ON CONFLICT UPDATE
could work with exclusion constraints without any of these problems,
because of the way we do the auxiliary update there (we lock the row
ahead of updating/qual evaluation). I've bent over backwards to make
sure that is the case. Please, throw me a bone here.

Thank you
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2015-03-02 21:23:03 Re: Providing catalog view to pg_hba.conf file - Patch submission
Previous Message Qingqing Zhou 2015-03-02 21:00:31 Re: 32bit OID wrap around concerns