From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Boszormenyi Zoltan <zb(at)cybertec(dot)at> |
Cc: | Cousin Marc <cousinmarc(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at> |
Subject: | bug in fast-path locking |
Date: | 2012-04-09 01:37:23 |
Message-ID: | CA+TgmobyD_4_NR5wVs7N6W5be9k6F0yQLTGNg4_jV5OUvesm8A@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Apr 8, 2012 at 12:43 PM, Boszormenyi Zoltan <zb(at)cybertec(dot)at> wrote:
>> Indeed, the unpatched GIT version crashes if you enter
>> =#lock TABLE pgbench_accounts ;
>> the second time in session 2 after the first one failed. Also,
>> manually spelling it out:
>>
>> Session 1:
>>
>> $ psql
>> psql (9.2devel)
>> Type "help" for help.
>>
>> zozo=# begin;
>> BEGIN
>> zozo=# lock table pgbench_accounts;
>> LOCK TABLE
>> zozo=#
>>
>> Session 2:
>>
>> zozo=# begin;
>> BEGIN
>> zozo=# savepoint a;
>> SAVEPOINT
>> zozo=# lock table pgbench_accounts;
>> ERROR: canceling statement due to statement timeout
>> zozo=# rollback to a;
>> ROLLBACK
>> zozo=# savepoint b;
>> SAVEPOINT
>> zozo=# lock table pgbench_accounts;
>> The connection to the server was lost. Attempting reset: Failed.
>> !>
>>
>> Server log after the second lock table:
>>
>> TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File:
>> "lock.c", Line: 749)
>> LOG: server process (PID 12978) was terminated by signal 6: Aborted
>
>
> Robert, the Assert triggering with the above procedure
> is in your "fast path" locking code with current GIT.
Yes, that sure looks like a bug. There seem to be three cases. If the
top-level transaction is aborting, LockReleaseAll() is called and
everything gets cleaned up properly. If a subtransaction is aborting
after the lock is fully granted, the locks held by the subtransaction
are released one at a time using LockRelease(). But if the
subtransaction is aborted *during the lock wait*, we only do
LockWaitCancel(), which doesn't clean up the LOCALLOCK. Before the
fast-lock patch, that didn't really matter, but now it does, because
that LOCALLOCK is tracking the fact that we're holding onto a shared
resource: the strong lock count. So I think that LockWaitCancel()
needs some kind of adjustment, but I haven't figured out exactly what
it is yet.
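
To make the failure mode concrete, here is a toy model of the bookkeeping described above. It is only a sketch: every name in it (ToyLocalLock, toy_begin_strong_lock, toy_cancel_wait, toy_replay_session2) is hypothetical and none of it is taken from lock.c. The "clean up on cancel" branch is an assumption about one conceivable adjustment, not a statement of what the right fix is.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's lock.c. */

/* Shared state: how many backends hold or are acquiring a strong lock. */
static int toy_strong_lock_count = 0;

/* Backend-local bookkeeping, loosely modeled on a LOCALLOCK entry. */
typedef struct ToyLocalLock
{
    bool holds_strong_lock_count;   /* did this backend bump the shared count? */
} ToyLocalLock;

/*
 * Begin acquiring a strong lock.  Returns false in the situation where the
 * reported assertion would fire: the local entry already claims a share of
 * the strong lock count.
 */
static bool
toy_begin_strong_lock(ToyLocalLock *ll)
{
    if (ll->holds_strong_lock_count)
        return false;
    toy_strong_lock_count++;
    ll->holds_strong_lock_count = true;
    return true;
}

/*
 * Cancel a lock wait.  Leaving the wait queue is not modeled here; the only
 * question is whether the local bookkeeping is undone as well (assumed fix).
 */
static void
toy_cancel_wait(ToyLocalLock *ll, bool clean_up_local)
{
    if (clean_up_local && ll->holds_strong_lock_count)
    {
        assert(toy_strong_lock_count > 0);
        toy_strong_lock_count--;
        ll->holds_strong_lock_count = false;
    }
}

/*
 * Replay session 2: LOCK TABLE times out, ROLLBACK TO, LOCK TABLE again.
 * Returns true if the second attempt gets past the assertion check.
 */
static bool
toy_replay_session2(bool clean_up_local)
{
    ToyLocalLock ll = { false };
    bool         ok;

    toy_strong_lock_count = 0;
    ok = toy_begin_strong_lock(&ll);         /* first LOCK TABLE starts waiting */
    toy_cancel_wait(&ll, clean_up_local);    /* statement timeout, rollback to a */
    return ok && toy_begin_strong_lock(&ll); /* second LOCK TABLE */
}
```

In this model, toy_replay_session2(false) fails on the second attempt because the stale flag is still set, which corresponds to the reported assertion failure. toy_replay_session2(true) succeeds because the cancel path has put both the flag and the shared count back.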
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company