Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: odo(at)odoo(dot)com, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
Date: 2015-12-18 18:53:35
Message-ID: 20151218185335.GU2618@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Alvaro Herrera wrote:

> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 559970f..aaf8e8e 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -4005,7 +4005,7 @@ l3:
> UnlockReleaseBuffer(*buffer);
> elog(ERROR, "attempted to lock invisible tuple");
> }
> - else if (result == HeapTupleBeingUpdated)
> + else if (result == HeapTupleBeingUpdated || result == HeapTupleUpdated)
> {
> TransactionId xwait;
> uint16 infomask;
>
> I think heap_lock_rows had that shape (only consider BeingUpdated as a
> reason to check/wait) only because it was possible to lock a row that
> was being locked by someone else, but it wasn't possible to lock a row
> that had been updated by someone else -- which became possible in 9.3.
> So this patch is necessary, and not just to fix this one bug.
>
> I need to study the implications of this patch more closely, but I think
> in theory it is correct.

Hmm. So one thing against this patch is that some results of existing
isolation tests change. For instance, this one (lock-update-delete):

starting permutation: s2b s1l s2u s2_blocker3 s2c s2_unlock
pg_advisory_lock


step s2b: BEGIN;
step s1l: SELECT * FROM foo WHERE pg_advisory_xact_lock(0) IS NOT NULL AND key = 1 FOR KEY SHARE; <waiting ...>
step s2u: UPDATE foo SET value = 2 WHERE key = 1;
step s2_blocker3: UPDATE foo SET value = 2 WHERE key = 1;
step s2c: COMMIT;
step s2_unlock: SELECT pg_advisory_unlock(0);
pg_advisory_unlock

t
step s1l: <... completed>
key value

1 2

The initial SELECT is blocked and returns the updated tuple; if I patch
as proposed above, the returned tuple is the *original* tuple (i.e. it
returns value=1 instead). I think there's room for arguing that that is
the correct result; since the tuple lock acquired by FOR KEY SHARE does
not conflict with the UPDATE, what would be the reason to reject the
original tuple? In fact, in the permutation that does the
advisory_unlock before the commit, that's exactly what happens (value=1
is returned).

Also, in the repeatable read isolation level, the expected file has this
permutation as failing with "could not serialize" -- but with the patch,
it no longer fails and instead it returns the value=1 tuple instead.
Kevin, what's your opinion on that change? Is this case supposed to
raise an error or not?

I have a hard time convincing myself that it's acceptable to back-patch
such a change, in any case. I observed no other regression failure, but
what did change does make me a bit uncomfortable.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Grittner 2015-12-18 21:00:58 Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
Previous Message Peter J. Holzer 2015-12-18 18:47:07 Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete