Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date: 2013-09-22 00:07:11
Message-ID: CAM3SWZRXwOGXWjhTbgXM8tTrEhRefgho7tadn_vH5o3x=Wh6KA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 15, 2013 at 8:23 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> I'll find it very difficult to accept any implementation that is going
>> to bloat things even worse than our upsert looping example.
>
> How would any even halfway sensible example cause *more* bloat than the
> upsert looping thing?

I was away in Chicago over the week, and didn't get to answer this.
Sorry about that.

In the average/uncontended case, the subxact example bloats less than
all alternatives to my design proposed to date (including the "unborn
heap tuple" idea Robert mentioned in passing to me in person the other
day, which I think is somewhat similar to a suggestion of Heikki's
[1]). The average case is very important, because in general
contention usually doesn't happen. But you need to also appreciate
that because of the way row locking works and the guarantees value
locking makes, any ON DUPLICATE KEY LOCK FOR UPDATE implementation is
going to have to potentially restart in more places (as compared to
the doc's example), maybe including value locking of each unique index
and certainly including row locking. So the contended case might even
be worse as well.

On average, it is quite likely that either the UPDATE or INSERT will
succeed - there has to be some concurrent activity around the same
values for either to fail, and in general that's quite unlikely. If
the UPDATE doesn't succeed, it won't bloat, and it's then very likely
that the INSERT at the end of the loop will go ahead and succeed
without itself creating bloat.

Going forward with this discussion, I would like us all to take as
read that the buffer locking stuff is a prototype approach to value
locking, to be refined later (subject to my basic design being judged
fundamentally sound). I don't think anyone believes that it's
fundamentally incorrect in that it doesn't do something that it claims
to do (concerns are more around what it might do or prevent that it
shouldn't), and it can still drive discussion in a very useful
direction. So far criticism of this patch has almost entirely been on
aspects of buffer locking, but it would be much more useful for the
time being to simply assume that the buffer locks *are* interruptible.
It's probably okay with me to still be a bit suspicious of
deadlocking, though, because if we refine the buffer locking using a
more granular SLRU value locking approach, that doesn't necessarily
guarantee that it's impossible, even if it does (I guess) prevent
undesirable interactions with other buffer locking.

[1] http://www.postgresql.org/message-id/45E845C4.6030000@enterprisedb.com

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Steve Singer 2013-09-22 00:33:49 Re: logical changeset generation v6
Previous Message Andrew Dunstan 2013-09-21 23:08:35 Re: VMs for Reviewers Available