Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date: 2013-09-26 19:07:42
Message-ID: CAM3SWZTCaHj-ayxK+pgFy6dNBQgHTyxhVL=hp9-6fcaK-B5TRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 26, 2013 at 4:43 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> So, I guess my question is if we are only bloating on a contended
> operation, do we expect that to happen so much that bloat is a problem?

Maybe I could have done a better job of explaining the nature of my
concerns around bloat.

I am specifically concerned about bloat and the clean-up of bloat that
occurs between (or during) value locking and eventual row locking,
because of the necessarily opportunistic nature of the way we go from
one to the other. Bloat, and the obligation to clean it up
synchronously, make row lock conflicts more likely. Conflicts make
bloat more likely, because a conflict implies that another iteration,
complete with more bloat, is necessary.

When you consider that the feature will frequently be used with the
assumption that updating is a much more likely outcome, it becomes
clear that we need to be careful about this sort of interplay.

Having said all that, I would have no objection to some reasonable,
bound amount of bloat occurring elsewhere if that made sense. For
example, I'd certainly be happy to consider the question of whether or
not it's worth doing a kind of speculative heap insertion before
acquiring value locks, because that doesn't need to happen again and
again in the same, critical place, in the interim between value
locking and row locking. The advantage of doing that particular thing
would be to reduce the duration that value locks are held - the
disadvantages would be the *usual* disadvantages of bloat. However,
this is obviously a premature discussion to have now, because the
eventual exact nature of value locks are not known.

> I think the big objection to the patch is the additional code complexity
> and the potential to slow down other sessions. If it is only bloating
> on a contended operation, are these two downsides worth avoiding the
> bloat?

I believe that all other schemes proposed have some degree of bloat
even in the uncontended case, because they optimistically assume than
an insert will occur, when in general an update is perhaps just as
likely, and will bloat just the same. So, as I've said before,
definition of uncontended is important here.

There is no reason to assume that alternative proposals will affect
concurrency any less than my proposal - the buffer locking thing
certainly isn't essential to my design. You need to weigh things like
WAL-logging multiples times, that other proposals have. You're right
to say that all of this is complex, but I really think that quite
apart from anything else, my design is simpler than others. For
example, the design that Robert sketched would introduce a fairly
considerable modularity violation, per my recent remarks to him, and
actually plastering over that would be a considerable undertaking.
Now, you might counter, "but those other designs haven't been worked
out enough". That's true, but then my efforts to work them out further
by pointing out problems with them haven't gone very far. I have
sincerely tried to see a way to make them work.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-09-26 19:15:16 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Previous Message Jim Nasby 2013-09-26 18:58:45 Re: Minmax indexes