Cleaning up recovery from subtransaction start failure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Cleaning up recovery from subtransaction start failure
Date: 2004-09-13 22:55:38
Message-ID: 27135.1095116138@sss.pgh.pa.us
Lists: pgsql-hackers

BTW, I'm not going to make the lock release/XactLockTableWait fix just yet,
because exhaustion of shared memory provides an easy test case for a
problem that I want to fix first. What I noticed while testing the
reported case is that you get

WARNING: out of shared memory
CONTEXT: PL/pgSQL function "do_standard_mgc" line 2 at block variables initialization

(this is ShmemAlloc choking)

ERROR: out of shared memory
HINT: You may need to increase max_locks_per_transaction.
CONTEXT: PL/pgSQL function "do_standard_mgc" line 2 at block variables initialization

(this is LockAcquire choking as a result)

WARNING: StartAbortedSubTransaction while in START state

(this is xact.c trying to cope with the resultant error exit)

and after that we are stuck inside an aborted subtransaction:

(gdb) p *CurrentTransactionState
$2 = {transactionIdData = 0, name = 0x0, savepointLevel = 0, commandId = 8414,
state = TRANS_ABORT, blockState = TBLOCK_SUBABORT, nestingLevel = 2,
curTransactionContext = 0x4011b6d0, curTransactionOwner = 0x4017b1f8,
childXids = 0x0, currentUser = 0, prevXactReadOnly = 0 '\000',
parent = 0x40001130}
(gdb) p *CurrentTransactionState->parent
$3 = {transactionIdData = 78724, name = 0x0, savepointLevel = 0,
commandId = 8414, state = TRANS_INPROGRESS, blockState = TBLOCK_STARTED,
nestingLevel = 1, curTransactionContext = 0x400bc6b8,
curTransactionOwner = 0x40069d58, childXids = 0x400c3030, currentUser = 0,
prevXactReadOnly = 0 '\000', parent = 0x0}
(gdb)

which means that I have to type "abort;" to get the system back into a
usable state. Since I did not type "begin;" first, that is clearly a
bug.

I don't actually like StartAbortedSubTransaction at all --- ISTM that if
you get a failure trying to enter a subxact, it's better *not* to enter
the subxact and instead to treat the error as putting the calling xact
in abort state.
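
To make that alternative concrete, here is a minimal sketch in C. It is a toy model, not code from xact.c: the DemoXact struct, the DemoState enum, and demo_begin_sub() are all hypothetical names invented for illustration, and the setup_ok flag merely stands in for whatever resource acquisition might fail (such as the lock acquisition above). On failure no child level is pushed, and the calling level itself is marked aborted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real TRANS_* / TBLOCK_* enums. */
typedef enum { DEMO_INPROGRESS, DEMO_ABORT } DemoState;

/* Hypothetical stand-in for one entry of the transaction state stack. */
typedef struct DemoXact {
    DemoState state;
    int nestingLevel;
    struct DemoXact *parent;
} DemoXact;

/*
 * Try to start a subtransaction below 'cur'.  'child' is caller-provided
 * storage for the new level; 'setup_ok' simulates whether the resources
 * needed to enter the subtransaction could be acquired.
 *
 * Success: the child is linked in and becomes the current level.
 * Failure: nothing is pushed; the caller's own level is marked aborted
 *          and remains the current level.
 */
DemoXact *demo_begin_sub(DemoXact *cur, DemoXact *child, bool setup_ok)
{
    assert(cur != NULL && child != NULL);

    if (!setup_ok) {
        /* Do not enter the subxact; abort the calling xact instead. */
        cur->state = DEMO_ABORT;
        return cur;
    }

    child->state = DEMO_INPROGRESS;
    child->nestingLevel = cur->nestingLevel + 1;
    child->parent = cur;
    return child;
}
```

In this toy model a failed start leaves the nesting level unchanged, so there is no half-initialized child level that would later need its own cleanup.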

Also, I found a problem a couple days ago with errors occurring during
transaction commit, for instance a sequence like

    begin;
    set constraints all deferred;
    insert into foo () -- assume this violates a deferred FK constraint
    savepoint a;
    commit;

The problem is that the COMMIT command only changed the state of the
topmost transaction record (the savepoint's subxact), which we then pop
off the stack during commit. After that we run deferred triggers.
If we get an error at that stage, there is nothing left on the state
stack to remind us that the user had typed commit ... and so guess what,
we stay inside the aborted transaction, and the user has the same
problem of needing to type abort when he shouldn't.

I put in what I think is a correct fix for that one here:
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/access/transam/xact.c.diff?r1=1.186;r2=1.187
(ignore the renamings of trigger functions in the same patch).
Basically the idea is that COMMIT should mark the *entire* state stack
before we start to pop anything, and then we will still remember what we
were doing if we get an error after the first pop.
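
The idea can be sketched with a toy state stack in C. This is an illustration under stated assumptions, not the code from the patch linked above: ToyXactState, ToyBlockState, and the toy_* functions are hypothetical names, and the real state machine in xact.c has many more states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified block states; not the real TBLOCK_* enum. */
typedef enum { TOY_INPROGRESS, TOY_END } ToyBlockState;

/* Hypothetical stand-in for one entry of the transaction state stack. */
typedef struct ToyXactState {
    ToyBlockState blockState;
    int nestingLevel;
    struct ToyXactState *parent;
} ToyXactState;

/*
 * COMMIT: record the user's request on every level, from the current
 * (innermost) one up to the top-level transaction, before popping any.
 */
void toy_mark_commit(ToyXactState *cur)
{
    for (ToyXactState *s = cur; s != NULL; s = s->parent)
        s->blockState = TOY_END;
}

/* Pop the innermost level and return its parent as the new current level. */
ToyXactState *toy_pop(ToyXactState *cur)
{
    assert(cur != NULL && cur->parent != NULL);
    return cur->parent;
}

/*
 * After an error, decide whether the user still has to issue ABORT.
 * If the surviving level remembers the COMMIT request, error cleanup
 * can end the transaction without further user input.
 */
bool toy_user_must_abort(const ToyXactState *cur)
{
    return cur->blockState != TOY_END;
}
```

If only the innermost level were marked, the mark would vanish with the first toy_pop(), and toy_user_must_abort() on the parent would report that the user still has to abort. With every level marked up front, the parent still carries the request after the pop.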

Since then I have been thinking that we should get rid of some of the
subxact-abort-related states in favor of handling abort cases similarly.
I don't have a test case to prove that there's a problem, but in general
errors during error aborts are a real threat that has to be considered.

Thoughts?

regards, tom lane
