Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Date: 2015-02-19 13:21:34
Message-ID: 54E5E35E.2070305@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02/18/2015 11:43 PM, Peter Geoghegan wrote:
> Heikki seemed to think that the deadlock problems were not really
> worth fixing independently of ON CONFLICT UPDATE support, but rather
> represented a useful way of committing code incrementally. Do I have
> that right?

Yes.

> The way I chose to break the livelock (what I call "livelock
> insurance") does indeed involve comparing XIDs, which Heikki thought
> most promising. But it also involves having the oldest XID wait on
> another session's speculative token in the second phase, which
> ordinarily does not occur in the second
> phase/check_exclusion_or_unique_constraint() call. The idea is that
> one session is guaranteed to be the waiter that has a second iteration
> within its second-phase check_exclusion_or_unique_constraint() call,
> where (following the super deletion of conflict TIDs by the other
> conflicting session or sessions) reliably finds that it can proceed
> (those other sessions are denied the opportunity to reach their second
> phase by our second phase waiter's still-not-super-deleted tuple).
>
> However, it's difficult to see how to map this on to general exclusion
> constraint insertion + enforcement. In Heikki's recent sketch of this
> [1], there is no pre-check, since that is considered an "UPSERT thing"
> deferred until a later patch, and therefore my scheme here cannot work
> (recall that for plain inserts with exclusion constraints, we insert
> first and check last). I have a hard time justifying adding the
> pre-check for plain exclusion constraint inserters given the total
> lack of complaints from the field regarding unprincipled deadlocks,
> and given the fact that it would probably make the code more
> complicated than it needs to be. It is critical that there be a
> pre-check to prevent livelock, though, because otherwise the
> restarting sessions can go straight to their "second" phase
> (check_exclusion_or_unique_constraint() call), without ever realizing
> that they should not do so.

Hmm. I haven't looked at your latest patch, but I don't think you need
to pre-check for this to work. To recap, the situation is that two
backends have already inserted the heap tuple, and then see that the
other backend's tuple conflicts. To avoid a livelock, it's enough that
one backend super-deletes its own tuple first, before waiting for the
other to complete, while the other other backend waits without
super-deleting. No?

> It seems like the livelock insurance is pretty close to or actually
> free, and doesn't imply that a "second phase wait for token" needs to
> happen too often. With heavy contention on a small number of possible
> tuples (100), and 8 clients deleting and inserting, I can see it
> happening only once every couple of hundred milliseconds on my laptop.
> It's not hard to imagine why the code didn't obviously appear to be
> broken before now, as the window for an actual livelock must have been
> small. Also, deadlocks bring about more deadlocks (since the deadlock
> detector kicks in), whereas livelocks do not bring about more
> livelocks.

It might be easier to provoke the livelocks with a GiST opclass that's
unusually slow. I wrote the attached opclass for the purpose of testing
this a while ago, but I haven't actually gotten around to do much with
it. It's called "useless_gist", because it's a GiST opclass for
integers, like btree_gist, but the penalty and picksplit functions are
totally dumb. The result is that the tuples are put to the index in
pretty much random order, and every scan has to scan the whole index.
I'm posting it here, in the hope that it happens to be useful, but I
don't really know if it is.

- Heikki

Attachment Content-Type Size
useless_gist.tar.gz application/gzip 1.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-02-19 14:11:45 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Previous Message Marko Tiikkaja 2015-02-19 10:53:37 xpath changes in the recent back branches