Quick Links

bug in fast-path locking

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc:	Cousin Marc <cousinmarc(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>
Subject:	bug in fast-path locking
Date:	2012-04-09 01:37:23
Message-ID:	CA+TgmobyD_4_NR5wVs7N6W5be9k6F0yQLTGNg4_jV5OUvesm8A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sun, Apr 8, 2012 at 12:43 PM, Boszormenyi Zoltan <zb(at)cybertec(dot)at> wrote:
>> Indeed, the unpatched GIT version crashes if you enter
>> =#lock TABLE pgbench_accounts ;
>> the second time in session 2 after the first one failed. Also,
>> manually spelling it out:
>>
>> Session 1:
>>
>> $ psql
>> psql (9.2devel)
>> Type "help" for help.
>>
>> zozo=# begin;
>> BEGIN
>> zozo=# lock table pgbench_accounts;
>> LOCK TABLE
>> zozo=#
>>
>> Session 2:
>>
>> zozo=# begin;
>> BEGIN
>> zozo=# savepoint a;
>> SAVEPOINT
>> zozo=# lock table pgbench_accounts;
>> ERROR: canceling statement due to statement timeout
>> zozo=# rollback to a;
>> ROLLBACK
>> zozo=# savepoint b;
>> SAVEPOINT
>> zozo=# lock table pgbench_accounts;
>> The connection to the server was lost. Attempting reset: Failed.
>> !>
>>
>> Server log after the second lock table:
>>
>> TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File:
>> "lock.c", Line: 749)
>> LOG: server process (PID 12978) was terminated by signal 6: Aborted
>
>
> Robert, the Assert triggering with the above procedure
> is in your "fast path" locking code with current GIT.

Yes, that sure looks like a bug. It seems that if the top-level
transaction is aborting, then LockReleaseAll() is called and
everything gets cleaned up properly; or if a subtransaction is
aborting after the lock is fully granted, then the locks held by the
subtransaction are released one at a time using LockRelease(), but if
the subtransaction is aborted *during the lock wait* then we only do
LockWaitCancel(), which doesn't clean up the LOCALLOCK. Before the
fast-lock patch, that didn't really matter, but now it does, because
that LOCALLOCK is tracking the fact that we're holding onto a shared
resource - the strong lock count. So I think that LockWaitCancel()
needs some kind of adjustment, but I haven't figured out exactly what
it is yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Responses

Re: bug in fast-path locking at 2012-04-09 17:32:39 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Noah Misch	2012-04-09 02:15:19	Re: ECPG FETCH readahead
Previous Message	Adrian Klaver	2012-04-08 23:41:31	Re: 9.1.3 Standby catchup mode