Nested transactions RFC

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Nested transactions RFC
Date: 2002-05-10 11:12:21
Message-ID: riandu8d681u68h1grteqo7nvh6hmtilis@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

if it is acceptable for subtransactions to use up transaction numbers,
then here is a half baked RFC for a possible implementation.
If not, forget the rest of this message.

The proposed implementation works much like the current transaction
handling. It needs an additional system table
pg_subtrans (child XactId PRIMARY KEY, parent XactId).

BEGIN; -- starts a new (top level) transaction, say 100

INSERT row1; -- row1.xmin = 100
DELETE row2; -- row2.xmax = 100

BEGIN; -- starts a subtransaction, let's call it 200,
-- stores 100 on the parent transaction stack
-- (a local memory structure),
-- inserts (200, 100) into pg_subtrans

INSERT row3; -- row3.xmin = 200, row3.XMIN_IS_SUB = true

DELETE row4; -- row4.xmax = 200, row4.XMAX_IS_SUB = true

COMMIT; -- resets CurrentTransaction to 100 (pop from xact stack),
-- does *NOT* mark T200 as committed

BEGIN; -- starts a subtransaction, let's call it 300,
-- pushes 100 on the parent transaction stack,
-- inserts (300, 100) into pg_subtrans

BEGIN; -- starts a 3rd level subtransaction (400),
-- pushes 300 on the parent transaction stack,
-- inserts (400, 300) into pg_subtrans

...

COMMIT; -- resets CurrentTransaction to 300 (transaction stack),
-- does NOT mark T400 as committed

INSERT row5; -- row5.xmin = 300, row5.XMIN_IS_SUB = true

DELETE row6; -- row6.xmax = 300, row6.XMAX_IS_SUB = true

ROLLBACK; -- resets CurrentTransaction to 100 (transaction stack),
-- optionally removes (300, 100) from pg_subtrans,
-- marks T300 as aborted

COMMIT; -- marks T100 as committed
or
ROLLBACK; -- marks T100 as aborted

Visibility:
-----------

The checks for xmin and xmax are very similar. We look at xmin here:

Traditionally a tuple is visible, if xmin has committed before the
current snapshot was taken, or if xmin == CurrentTransaction().

A subtransaction is considered aborted, if it is marked aborted. Else
it is considered to be in the same state as its parent transaction
(which again can be a subtransaction).

The effects of tup.xmin are considered visible, if ...
(This is not a formal specification. It shall only illustrate the
difference to the existing implementation of HeapTupleSatisfiesXxx()
in tqual.c)

if (tup.XMIN_ABORTED) // flag set by prior visitor
return false;

if (tup.XMIN_COMMITTED) // flag set by prior visitor
return true;

// xmin neither known aborted nor known committed,
// could be active
// or finished and tup not yet visited
for (xmin = tup.xmin; IsValid(xmin); xmin = GetParentXact(xmin)) {
if (TransactionIdDidAbort(xmin)) {
tup.XMIN_ABORTED = true;
return false;
}/*if*/

if (IsCurrentTransaction(xmin)) {
// tup.xmin is one of my own subtransactions,
// it is already committed. So tup can be
// considered belonging to the current transaction.
tup.xmin = xmin;
tup.XMIN_IS_SUB = CurrentTransactionIsSub();
return true; // or rather check cmin ...
}/*if*/

if (TransactionIdDidCommit(xmin)) {
// xmin is a top level transaction
tup.xmin = xmin;
tup.XMIN_IS_SUB = false;
tup.XMIN_COMMITTED = true;
return true;
}/*if*/

if (!tup.XMIN_IS_SUB) {
// Don't try expensive GetParentXact()
break;
}/*if*/
}/*for*/

// tup.xmin still active
return false;

TransactionId GetParentXact(TransactionId xnum) uses pg_subtrans to
find the parent transaction of xnum. It returns InvalidTransaction,
if it doesn't find one.

Performance:
------------

. Zero overhead, if nested transactions are not used.

. BEGIN SUB has to insert a pair of TransactionIds into pg_subtrans.
Apart from that it is not slower than BEGIN top level transaction.

. COMMIT SUB is faster than COMMIT.

. ROLLBACK SUB is much like ROLLBACK, plus (optionally) deletes one
entry from pg_subtrans.

. COMMIT and ROLLBACK of top level transactions don't care about
subtransactions.

. Access a tuple inserted/deleted by a subtransaction: Zero
overhead, if the subtransaction has been rolled back, otherwise the
parent transaction has to be looked up in pg_subtrans (possibly
recursive). This price has to be paid only once per tuple (well, once
for xmin and once for xmax). More accurate: "once after the
inserting/deleting top level transaction has finished".

Problems:
---------

. pg_subtrans grows by 8 bytes per subtransaction.

. Other pitfalls???

Administration:
---------------

As soon as a top level transaction has finished, its subtransaction
ids are replaced by the top level transaction id on the next access to
each tuple.

VACUUM (*not* VACUUM tablename) removes old entries from pg_subtrans.
An entry is old, if the parent transaction has finished, before VACUUM
started.

Challenge:
----------

For heavy use of subtransactions there has to be a really fast
implementation of pg_subtrans, maybe something like a b-tree.

AFAICS small WAL changes: pg_subtrans inserts (and deletes?) have to
be logged.

Everything else is straightforward.

Comments?

Servus
Manfred

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message mlw 2002-05-10 11:13:04 Re: Threads vs processes - The Apache Way (Re: Path to PostgreSQL
Previous Message Jan Wieck 2002-05-10 10:29:15 Re: Queries using rules show no rows modified?