Re: nested transactions

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nested transactions
Date: 2002-11-29 17:03:56
Message-ID: 72ueuukn2vleinke8008vsbcd8o7kqkd2n@4ax.com
Lists: pgsql-hackers
On Thu, 28 Nov 2002 12:59:21 -0500 (EST), Bruce Momjian
<pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:
>Yes, locking is one possible solution, but no one likes that.  One hack
>lock idea would be to create a subtransaction-only lock, [...]
>
>> [...] without
>> having to touch the xids in the tuple headers.
>
>Yes, you could do that, but we can easily just set the clog bits
>atomically,

From what I read above I don't think we can *easily* set more than one
transaction's bits atomically.

> and it will not be needed --- the tuple bits really don't
>help us, I think.

Yes, this is what I said, or at least tried to say.  I just wanted to
make clear how this new approach (use the fourth status) differs from
older proposals (replace subtransaction ids in tuple headers).

>OK, we put it in a file.  And how do we efficiently clean it up?
>Remember, it is only to be used for a _brief_ period of time.  I think a
>file system solution is doable if we can figure out a way not to create
>a file for every xid.

I don't want to create one file for every transaction, but rather a
huge (sparse) array of parent xids.  This array is divided into
manageable chunks, represented by files, "pg_subtrans_NNNN".  These
files are only created when necessary.  At any time only a tiny part
of the whole array is kept in shared buffers.  This concept is nearly
identical to pg_clog, which is an array of two-bit status values.
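
As a rough illustration of the chunked array, the lookup of
pg_subtrans[xid] could be sketched as below.  The entry width, the
chunk size and the hexadecimal rendering of NNNN are assumptions made
only for this example; the proposal does not fix any of them.

```python
# Hypothetical layout constants; none of these values come from the proposal.
XID_BYTES = 4                 # assume one 32-bit parent xid per array entry
ENTRIES_PER_FILE = 1 << 16    # assumed chunk size, chosen only for illustration


def locate_parent_slot(xid):
    """Return (file name, byte offset) where pg_subtrans[xid] would live."""
    segment, index = divmod(xid, ENTRIES_PER_FILE)
    # NNNN is rendered as four hex digits here; that format is an assumption.
    return "pg_subtrans_%04X" % segment, index * XID_BYTES
```

Because a file is addressed purely by xid range, it only needs to
exist once some xid in that range actually records a parent.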

>Maybe we write the xid's to a file in a special directory in sorted
>order, and backends can do a btree search of each file in that directory
>looking for the xid, and then knowing the master xid, look up that
>status, and once all the children xid's are updated, you delete the
>file.

Yes, dense arrays or btrees are other possible implementations.  But
for simplicity I'd do it pg_clog style.

>Yes, but again, the xid status of subtransactions is only updated just
>before commit of the main transaction, so there is little value to
>having those visible.

Having them visible solves the atomicity problem without requiring
long locks.  Updating the status of a single (main or sub) transaction
is atomic, just like it is now.

Here is what is to be done for some operations:

BEGIN main transaction:
	Get a new xid (no change to current behaviour).
	pg_clog[xid] is still 00, meaning active.
	pg_subtrans[xid] is still 0, meaning no parent.

BEGIN subtransaction:
	Push current transaction info onto local stack.
	Get a new xid.
	Record parent xid in pg_subtrans[xid].
	pg_clog[xid] is still 00.

ROLLBACK subtransaction:
	Set pg_clog[xid] to 10 (aborted).
	Optionally set clog bits for subsubtransactions to 10.
	Pop transaction info from stack.

COMMIT subtransaction:
	Set pg_clog[xid] to 11 (committed subtrans).
	Don't touch clog bits for subsubtransactions!
	Pop transaction info from stack.

ROLLBACK main transaction:
	Set pg_clog[xid] to 10 (aborted).
	Optionally set clog bits for subtransactions to 10.
	
COMMIT main transaction:
	Set pg_clog[xid] to 01 (committed).
	Optionally set clog bits for subtransactions from 11 to 01.
	Don't touch clog bits for aborted subtransactions!
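
The six operations above can be modelled with a small toy sketch.
Plain dicts stand in for pg_clog and pg_subtrans, and the class and
method names are hypothetical, not existing backend code.  All the
"optionally set ..." steps are left out, since they are optional.

```python
ACTIVE, COMMITTED, ABORTED, SUB_COMMITTED = 0b00, 0b01, 0b10, 0b11


class Backend:
    """Toy model of one backend; dicts replace the on-disk arrays."""

    def __init__(self):
        self.clog = {}       # xid -> 2-bit status; a missing key means 00
        self.subtrans = {}   # xid -> parent xid; a missing key means 0
        self.next_xid = 1
        self.stack = []      # local stack of enclosing transaction xids
        self.xid = None      # xid of the current (sub)transaction

    def _new_xid(self):
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def begin(self):
        if self.xid is None:
            self.xid = self._new_xid()           # BEGIN main transaction
        else:
            self.stack.append(self.xid)          # push current transaction
            self.xid = self._new_xid()
            self.subtrans[self.xid] = self.stack[-1]   # record parent xid

    def _finish(self, main_status, sub_status):
        is_sub = bool(self.stack)
        self.clog[self.xid] = sub_status if is_sub else main_status
        self.xid = self.stack.pop() if is_sub else None

    def commit(self):
        # A subtransaction gets 11, the main transaction gets 01.
        self._finish(COMMITTED, SUB_COMMITTED)

    def rollback(self):
        # Both kinds of transaction get 10.
        self._finish(ABORTED, ABORTED)
```

Calling begin(), begin(), commit(), begin(), rollback(), commit() in
that order leaves xid 2 at 11, xid 3 at 10 and xid 1 at 01, with both
children pointing at parent 1 in the subtrans dict.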

Visibility check by other transactions:  If a tuple is visited and its
XMIN/XMAX_IS_COMMITTED/ABORTED flags are not yet set, pg_clog has to
be consulted to find out the status of the inserting/deleting
transaction xid.  If pg_clog[xid] is ...

	00:  transaction still active

	10:  aborted

	01:  committed

	11:  committed subtransaction, have to check parent

Only in this last case do we have to get parentxid from pg_subtrans.
Now we look at pg_clog[parentxid].  If we find ...

	00:  parent still active, so xid is considered active, too

	10:  parent aborted, so xid is considered aborted,
	     optionally set pg_clog[xid] = 10

	01:  parent committed, so xid is considered committed,
	     optionally set pg_clog[xid] = 01

	11:  recursively check grandparent(s) ...
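
The lookup just described can be sketched as follows.  This is only
an illustration under the same toy assumptions (dicts instead of
pg_clog and pg_subtrans, a hypothetical function name); it walks the
parent chain with a loop rather than recursion and applies the
optional status updates to every xid on the chain.

```python
ACTIVE, COMMITTED, ABORTED, SUB_COMMITTED = 0b00, 0b01, 0b10, 0b11


def resolve_status(xid, clog, subtrans, set_hints=True):
    """Return the effective status of xid as seen by another transaction."""
    chain = []                          # xids found in state 11 on the way up
    status = clog.get(xid, ACTIVE)      # a missing entry means 00 (active)
    while status == SUB_COMMITTED:
        chain.append(xid)
        xid = subtrans[xid]             # only needed in the 11 case
        status = clog.get(xid, ACTIVE)
    # Optional side effect: replace 11 with the final status once it is known.
    if set_hints and status in (COMMITTED, ABORTED):
        for child in chain:
            clog[child] = status
    return status
```

While the topmost ancestor is still active nothing is written back,
so the 11 entries stay in place until a final status exists.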

For brevity the following operations are not covered in detail:
. Visibility checks for tuples inserted/deleted by a (sub)transaction
belonging to the current transaction tree (have to check local
transaction stack whenever we look at an xid or switch to a parent xid)
. HeapTupleSatisfiesUpdate (sometimes has to wait for parent
transaction)

The trick here is that subtransaction status is updated in pg_clog
immediately on commit/abort.  Main transaction commit is atomic (just
set its commit bit).  Status 11 is short-lived; it is replaced with
the final status by one or more of

	- COMMIT/ROLLBACK of the main transaction
	- a later visibility check (as a side effect)
	- VACUUM

pg_subtrans cleanup:  A pg_subtrans_NNNN file covers a known range of
transaction ids.  As soon as none of these transactions has a pg_clog
status of 11, the pg_subtrans_NNNN file can be removed.  VACUUM can do
this, and it won't even have to check the heap.
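
A minimal sketch of that cleanup rule, again with a dict in place of
pg_clog and an assumed chunk size.  It applies the rule literally and
assumes no xid in the range is still active; the function name is
hypothetical.

```python
SUB_COMMITTED = 0b11
ENTRIES_PER_FILE = 1 << 16   # assumed number of xids covered by one file


def removable_segments(existing_segments, clog):
    """Return the segment numbers that no longer hold any status-11 xid."""
    busy = {
        xid // ENTRIES_PER_FILE
        for xid, status in clog.items()
        if status == SUB_COMMITTED
    }
    return sorted(set(existing_segments) - busy)
```

Only pg_clog is consulted, which matches the point that the heap does
not have to be checked.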

Servus
 Manfred
