From: | Manfred Koizar <mkoi-pg(at)aon(dot)at> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Nested transactions RFC |
Date: | 2002-05-10 11:12:21 |
Message-ID: | riandu8d681u68h1grteqo7nvh6hmtilis@4ax.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
if it is acceptable for subtransactions to use up transaction numbers,
then here is a half baked RFC for a possible implementation.
If not, forget the rest of this message.
The proposed implementation works much like the current transaction
handling. It needs an additional system table
pg_subtrans (child XactId PRIMARY KEY, parent XactId).
BEGIN; -- starts a new (top level) transaction, say 100
INSERT row1; -- row1.xmin = 100
DELETE row2; -- row2.xmax = 100
BEGIN; -- starts a subtransaction, let's call it 200,
-- stores 100 on the parent transaction stack
-- (a local memory structure),
-- inserts (200, 100) into pg_subtrans
INSERT row3; -- row3.xmin = 200, row3.XMIN_IS_SUB = true
DELETE row4; -- row4.xmax = 200, row4.XMAX_IS_SUB = true
COMMIT; -- resets CurrentTransaction to 100 (pop from xact stack),
-- does *NOT* mark T200 as committed
BEGIN; -- starts a subtransaction, let's call it 300,
-- pushes 100 on the parent transaction stack,
-- inserts (300, 100) into pg_subtrans
BEGIN; -- starts a 3rd level subtransaction (400),
-- pushes 300 on the parent transaction stack,
-- inserts (400, 300) into pg_subtrans
...
COMMIT; -- resets CurrentTransaction to 300 (transaction stack),
-- does NOT mark T400 as committed
INSERT row5; -- row5.xmin = 300, row5.XMIN_IS_SUB = true
DELETE row6; -- row6.xmax = 300, row6.XMAX_IS_SUB = true
ROLLBACK; -- resets CurrentTransaction to 100 (transaction stack),
-- optionally removes (300, 100) from pg_subtrans,
-- marks T300 as aborted
COMMIT; -- marks T100 as committed
or
ROLLBACK; -- marks T100 as aborted
Visibility:
-----------
The checks for xmin and xmax are very similar. We look at xmin here:
Traditionally a tuple is visible, if xmin has committed before the
current snapshot was taken, or if xmin == CurrentTransaction().
A subtransaction is considered aborted, if it is marked aborted. Else
it is considered to be in the same state as its parent transaction
(which again can be a subtransaction).
The effects of tup.xmin are considered visible, if ...
(This is not a formal specification. It shall only illustrate the
difference to the existing implementation of HeapTupleSatisfiesXxx()
in tqual.c)
if (tup.XMIN_ABORTED) // flag set by prior visitor
return false;
if (tup.XMIN_COMMITTED) // flag set by prior visitor
return true;
// xmin neither known aborted nor known committed,
// could be active
// or finished and tup not yet visited
for (xmin = tup.xmin; IsValid(xmin); xmin = GetParentXact(xmin)) {
if (TransactionIdDidAbort(xmin)) {
tup.XMIN_ABORTED = true;
return false;
}/*if*/
if (IsCurrentTransaction(xmin)) {
// tup.xmin is one of my own subtransactions,
// it is already committed. So tup can be
// considered belonging to the current transaction.
tup.xmin = xmin;
tup.XMIN_IS_SUB = CurrentTransactionIsSub();
return true; // or rather check cmin ...
}/*if*/
if (TransactionIdDidCommit(xmin)) {
// xmin is a top level transaction
tup.xmin = xmin;
tup.XMIN_IS_SUB = false;
tup.XMIN_COMMITTED = true;
return true;
}/*if*/
if (!tup.XMIN_IS_SUB) {
// Don't try expensive GetParentXact()
break;
}/*if*/
}/*for*/
// tup.xmin still active
return false;
TransactionId GetParentXact(TransactionId xnum) uses pg_subtrans to
find the parent transaction of xnum. It returns InvalidTransaction,
if it doesn't find one.
Performance:
------------
. Zero overhead, if nested transactions are not used.
. BEGIN SUB has to insert a pair of TransactionIds into pg_subtrans.
Apart from that it is not slower than BEGIN top level transaction.
. COMMIT SUB is faster than COMMIT.
. ROLLBACK SUB is much like ROLLBACK, plus (optionally) deletes one
entry from pg_subtrans.
. COMMIT and ROLLBACK of top level transactions don't care about
subtransactions.
. Access a tuple inserted/deleted by a subtransaction: Zero
overhead, if the subtransaction has been rolled back, otherwise the
parent transaction has to be looked up in pg_subtrans (possibly
recursive). This price has to be paid only once per tuple (well, once
for xmin and once for xmax). More accurate: "once after the
inserting/deleting top level transaction has finished".
Problems:
---------
. pg_subtrans grows by 8 bytes per subtransaction.
. Other pitfalls???
Administration:
---------------
As soon as a top level transaction has finished, its subtransaction
ids are replaced by the top level transaction id on the next access to
each tuple.
VACUUM (*not* VACUUM tablename) removes old entries from pg_subtrans.
An entry is old, if the parent transaction has finished, before VACUUM
started.
Challenge:
----------
For heavy use of subtransactions there has to be a really fast
implementation of pg_subtrans, maybe something like a b-tree.
AFAICS small WAL changes: pg_subtrans inserts (and deletes?) have to
be logged.
Everything else is straightforward.
Comments?
Servus
Manfred
From | Date | Subject | |
---|---|---|---|
Next Message | mlw | 2002-05-10 11:13:04 | Re: Threads vs processes - The Apache Way (Re: Path to PostgreSQL |
Previous Message | Jan Wieck | 2002-05-10 10:29:15 | Re: Queries using rules show no rows modified? |