Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)

From: Olivier Dony <odo+pgbugs(at)odoo(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>, Olivier Dony <odo(at)odoo(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
Date: 2015-12-22 01:55:20
Message-ID: CAP4GjTKdFn4MF+GndusWWxOJNF2aQ=DYrPYoNRytEKkm2twUZg@mail.gmail.com
Lists: pgsql-bugs

On Fri, Dec 18, 2015 at 7:53 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>
> Alvaro Herrera wrote:
>
> > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> > index 559970f..aaf8e8e 100644
> > --- a/src/backend/access/heap/heapam.c
> > +++ b/src/backend/access/heap/heapam.c
> > @@ -4005,7 +4005,7 @@ l3:
> > UnlockReleaseBuffer(*buffer);
> > elog(ERROR, "attempted to lock invisible tuple");
> > }
> > - else if (result == HeapTupleBeingUpdated)
> > + else if (result == HeapTupleBeingUpdated || result == HeapTupleUpdated)
> > {
> > TransactionId xwait;
> > uint16 infomask;
> >
> > I think heap_lock_rows had that shape (only consider BeingUpdated as a
> > reason to check/wait) only because it was possible to lock a row that
> > was being locked by someone else, but it wasn't possible to lock a row
> > that had been updated by someone else -- which became possible in 9.3.
> > So this patch is necessary, and not just to fix this one bug.

I was also surprised that the initial patch, which was meant as an
optimization, turned out to fix this bug.
It's starting to make more sense with your analysis.

> (...)
> I have a hard time convincing myself that it's acceptable to back-patch
> such a change, in any case. I observed no other regression failure, but
> what did change does make me a bit uncomfortable.

I'm afraid I won't be of much help in assessing the consequences of
the patch; the PG source code and internal data structures are still
pretty new to me.

Would it weigh in the balance that this use case worked fine in 9.2
and seems consistent with the documented behavior of row-level locks
and the REPEATABLE READ isolation level, but suddenly stopped working
in 9.3/9.4? I suppose 9.3 is an old story, but 9.4 has the same
problem and will be around for a while, being the default version in
most LTS/stable distributions.

Thanks so much to both of you for looking further into this bug!

PS: I can confirm that the patch works, but you didn't need my confirmation ;-)

--
Olivier
