Re: Why does execReplication.c lock tuples?

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Subject: Re: Why does execReplication.c lock tuples?
Date: 2019-01-29 16:22:43
Message-ID: c6270c98-4057-2d14-a361-a533849a6310@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29/01/2019 17:14, Andres Freund wrote:
> Hi
>
> On January 29, 2019 8:04:55 AM PST, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>> On 29/01/2019 16:28, Andres Freund wrote:
>>> Hi,
>>>
>>> On January 29, 2019 4:19:52 AM PST, Petr Jelinek
>> <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>>> Hi,
>>>>
>>>> On 20/01/2019 21:03, Andres Freund wrote:
>>>>> Hi,
>>>>>
>>>>> Currently RelationFindReplTupleByIndex(),
>> RelationFindReplTupleSeq()
>>>>> lock the found tuple. I don't quite get what that achieves - why
>>>> isn't
>>>>> dealing with concurrency in the table_update/delete calls at the
>>>>> callsites sufficient? As far as I can tell there's no meaningful
>>>>> concurrency handling in the heap_lock_tuple() paths, so it's not
>> like
>>>> we
>>>>> follow update chains or anything.
>>>>>
>>>>
>>>> Yeah that's leftover from the conflict detection/handling code that
>> I
>>>> stripped away to keep the patched manageable size-wise. As things
>> stand
>>>> now we could remove that and use normal heap_update instead of
>> simple
>>>> variant. It'll be likely be needed again if we add conflict handling
>> in
>>>> the future, but perhaps we could be smarter about it then (i.e. I
>> can
>>>> imagine that it will be per table anyway, not necessarily default
>>>> behavior).
>>>
>>> Why does conflict handling need the unconditional lock? Wouldn't just
>> doing that after an initial heap_update returned HeapTupleUpdated make
>> more sense? And wouldn't it need to reckeck the row afterwards as
>> well?
>>>
>>
>> To prevent tuple changing from under the conflict resolution logic - we
>> need to make sure that whatever tuple the conflict resolution logic
>> gets
>> is the current one. If the tuple is locked you won't get the
>> HeapTupleUpdated in heap_update anymore (which is why we can use
>> simple_heap_update).
>>
>> In any case we don't have conflict resolution at the moment so it's
>> probably moot right now.
>
> Not sure why that's relevant - what you'd do is to attempt the update, if it succeeds: great. If not, then you'd do the lock tuple and after that do the conflict resolution. Less WAL and cpu for the common case, same WAL as now for the conflict, with just a small bit of additional CPU costs for the latter.
>

But the conflict is not about local node concurrency, rather multi-node
one (ie, merging data from multiple nodes) so you can't depend on
heap_update return value alone. You usually need to figure out if there
is conflict before you even do update (based on tuple metadata/data)
which is why the lock is necessary.

> Either way, right now it's definitely superfluous...
>

Agreed.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sven Berkvens-Matthijsse 2019-01-29 16:25:53 Re: Follow-up on INSERT INTO ... SET ...
Previous Message Andres Freund 2019-01-29 16:14:32 Re: Why does execReplication.c lock tuples?