Re: bug in locking an update tuple chain

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bug in locking an update tuple chain
Date: 2017-07-18 10:29:14
Message-ID: CAA4eK1L5YYMX8HpKfjoe0_=37n9QRaY8j2=4fu-TO6pAKGnmjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 15, 2017 at 2:30 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> A customer of ours reported a problem in 9.3.14 while inserting tuples
> in a table with a foreign key, with many concurrent transactions doing
> the same: after a few insertions worked sucessfully, a later one would
> return failure indicating that the primary key value was not present in
> the referenced table. It worked fine for them on 9.3.4.
>
> After some research, we determined that the problem disappeared if
> commit this commit was reverted:
>
> Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> Branch: master Release: REL9_6_BR [533e9c6b0] 2016-07-15 14:17:20 -0400
> Branch: REL9_5_STABLE Release: REL9_5_4 [649dd1b58] 2016-07-15 14:17:20 -0400
> Branch: REL9_4_STABLE Release: REL9_4_9 [166873dd0] 2016-07-15 14:17:20 -0400
> Branch: REL9_3_STABLE Release: REL9_3_14 [6c243f90a] 2016-07-15 14:17:20 -0400
>
> Avoid serializability errors when locking a tuple with a committed update
>
> I spent some time writing an isolationtester spec to reproduce the
> problem. It turned out that this required six concurrent sessions in
> order for the problem to show up at all, but once I had that, figuring
> out what was going on was simple: a transaction wants to lock the
> updated version of some tuple, and it does so; and some other
> transaction is also locking the same tuple concurrently in a compatible
> way. So both are okay to proceed concurrently. The problem is that if
> one of them detects that anything changed in the process of doing this
> (such as the other session updating the multixact to include itself,
> both having compatible lock modes), it loops back to ensure xmax/
> infomask are still sane; but heap_lock_updated_tuple_rec is not prepared
> to deal with the situation of "the current transaction has the lock
> already", so it returns a failure and the tuple is returned as "not
> visible" causing the described problem.
>

Your fix seems logical to me, though I have not tested it till now.
However, I wonder why heap_lock_tuple need to restart from the
beginning of update-chain in this case?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message david.turon 2017-07-18 11:36:32 JSONB - JSONB operator feature request
Previous Message Michael Paquier 2017-07-18 10:10:59 Re: Proposal for CSN based snapshots