Question concerning XTM (eXtensible Transaction Manager API)

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Question concerning XTM (eXtensible Transaction Manager API)
Date: 2015-11-16 08:47:09
Message-ID: 5649980D.40103@postgrespro.ru
Lists: pgsql-hackers

Hello,

Some time ago at PgConf.Vienna we proposed the eXtensible Transaction
Manager API (XTM).
The idea is to make it possible to provide custom transaction manager
implementations as standard Postgres extensions;
the primary goal is the implementation of a distributed transaction
manager.
It should not only support 2PC, but also provide consistent snapshots
for global transactions executed at different nodes.

Actually, the current version of the XTM API does not impose any
particular 2PC model.
It can be implemented either on the coordinator side
(as is done in our pg_tsdtm <https://github.com/postgrespro/pg_tsdtm>
implementation, which is based on timestamps and does not require a
centralized arbiter) or by an arbiter
(pg_dtm <https://github.com/postgrespro/pg_dtm>). In the latter case the
2PC logic is hidden behind the XTM SetTransactionStatus method:

bool (*SetTransactionStatus)(TransactionId xid, int nsubxids,
TransactionId *subxids, XidStatus status, XLogRecPtr lsn);

which encapsulates TransactionIdSetTreeStatus in clog.c.
But you may notice that the original TransactionIdSetTreeStatus function
is void: it is not intended to return anything.
It is called from RecordTransactionCommit inside a critical section,
where a commit is not expected to fail.
But with a DTM, the transaction may be rejected by the arbiter. The XTM
API allows us to control access to CLOG, so everybody will see that the
transaction is aborted. But in any case we have to somehow notify the
client that the transaction was aborted.
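
To make the mismatch concrete, here is a minimal standalone sketch of a
status hook that returns bool while wrapping a core routine that returns
void. It is not the actual XTM patch: the typedefs are stand-ins for the
PostgreSQL types, and TransactionManager, DefaultTM, DtmTM, mock_clog,
MockSetTreeStatus and arbiter_votes_commit are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL types so the sketch compiles on its own. */
typedef uint32_t TransactionId;
typedef int      XidStatus;
typedef uint64_t XLogRecPtr;

#define TRANSACTION_STATUS_COMMITTED 0x01
#define TRANSACTION_STATUS_ABORTED   0x02

#define MOCK_CLOG_SIZE 16
static XidStatus mock_clog[MOCK_CLOG_SIZE];    /* toy replacement for CLOG */

/* Mimics the void core function: it records a status and cannot fail. */
static void
MockSetTreeStatus(TransactionId xid, int nsubxids, TransactionId *subxids,
                  XidStatus status, XLogRecPtr lsn)
{
    int i;

    (void) lsn;                     /* unused in this toy model */
    assert(xid < MOCK_CLOG_SIZE);
    mock_clog[xid] = status;
    for (i = 0; i < nsubxids; i++)
    {
        assert(subxids[i] < MOCK_CLOG_SIZE);
        mock_clog[subxids[i]] = status;
    }
}

/* Hypothetical hook table; only the method quoted above is modelled. */
typedef struct
{
    bool (*SetTransactionStatus)(TransactionId xid, int nsubxids,
                                 TransactionId *subxids, XidStatus status,
                                 XLogRecPtr lsn);
} TransactionManager;

/* Default manager: the local commit always succeeds. */
static bool
DefaultSetTransactionStatus(TransactionId xid, int nsubxids,
                            TransactionId *subxids, XidStatus status,
                            XLogRecPtr lsn)
{
    MockSetTreeStatus(xid, nsubxids, subxids, status, lsn);
    return true;
}

/* Assumption: a flag stands in for the network round trip to the arbiter. */
static bool arbiter_votes_commit = true;

/* Arbiter-based manager: a requested commit may be turned into an abort. */
static bool
DtmSetTransactionStatus(TransactionId xid, int nsubxids,
                        TransactionId *subxids, XidStatus status,
                        XLogRecPtr lsn)
{
    if (status == TRANSACTION_STATUS_COMMITTED && !arbiter_votes_commit)
    {
        MockSetTreeStatus(xid, nsubxids, subxids,
                          TRANSACTION_STATUS_ABORTED, lsn);
        return false;               /* caller must report this to the client */
    }
    MockSetTreeStatus(xid, nsubxids, subxids, status, lsn);
    return true;
}

static TransactionManager DefaultTM = { DefaultSetTransactionStatus };
static TransactionManager DtmTM = { DtmSetTransactionStatus };
```

With the default manager the return value carries no information, which
matches the void signature in core. Only the arbiter-based manager can
return false, and that is the case the caller has no existing way to
handle.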

We cannot just call elog(ERROR,...) in the SetTransactionStatus
implementation, because inside a critical section it causes Postgres to
crash with a panic message. So we have to remember that the transaction
was rejected and report the error later, after exiting the critical
section:

    /*
     * Now we may update the CLOG, if we wrote a COMMIT record above
     */
    if (markXidCommitted) {
        committed = TransactionIdCommitTree(xid, nchildren, children);
    }
    ...
    /*
     * If we entered a commit critical section, leave it now, and let
     * checkpoints proceed.
     */
    if (markXidCommitted)
    {
        MyPgXact->delayChkpt = false;
        END_CRIT_SECTION();
        if (!committed) {
            CurrentTransactionState->state = TRANS_ABORT;
            CurrentTransactionState->blockState = TBLOCK_ABORT_PENDING;
            elog(ERROR, "Transaction commit rejected by XTM");
        }
    }

There is one more problem: at this moment the state of the transaction
is TRANS_COMMIT.
If the ERROR handler tries to abort it, we get yet another fatal
error: an attempt to roll back a committed transaction.
So we need to hide the fact that the transaction is actually committed
in the local XLOG.
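
The ordering described above can be modelled in a few lines of
standalone C. This is only a toy model, not PostgreSQL code: the
critical section is a counter, error reporting is reduced to recording
which severity would apply, and MockXact, effective_level and
commit_with_deferred_error are hypothetical names. The one PostgreSQL
behaviour it relies on is that an ERROR raised inside a critical section
is promoted to PANIC.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { LEVEL_NONE, LEVEL_ERROR, LEVEL_PANIC } Level;
typedef enum { TRANS_INPROGRESS, TRANS_COMMIT, TRANS_ABORT } TransState;

/* Toy model of the backend's critical-section nesting counter. */
static int crit_section_count = 0;

/* An ERROR raised while the counter is nonzero would become a PANIC. */
static Level
effective_level(void)
{
    return crit_section_count > 0 ? LEVEL_PANIC : LEVEL_ERROR;
}

/* Hypothetical per-transaction bookkeeping for this sketch. */
typedef struct
{
    TransState state;       /* what the abort path would see */
    Level      reported;    /* severity of the error raised, if any */
} MockXact;

/*
 * Remember the arbiter's verdict inside the critical section, and only
 * act on it after the section has ended.
 */
static bool
commit_with_deferred_error(MockXact *xact, bool arbiter_accepts)
{
    bool committed;

    xact->state = TRANS_COMMIT;
    xact->reported = LEVEL_NONE;

    crit_section_count++;           /* models START_CRIT_SECTION() */
    committed = arbiter_accepts;    /* models the status hook's result */
    crit_section_count--;           /* models END_CRIT_SECTION() */

    if (!committed)
    {
        /* Reset the state first so the abort path accepts the rollback. */
        xact->state = TRANS_ABORT;
        assert(crit_section_count == 0);
        xact->reported = effective_level();
    }
    return committed;
}
```

Because the error is raised only after the counter has dropped back to
zero, the rejected commit surfaces as an ordinary ERROR, and the abort
path sees TRANS_ABORT rather than TRANS_COMMIT.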

This approach works, but it looks like a bit of a hack. It
requires not only replacing the direct call of
TransactionIdSetTreeStatus with an indirect one (through the XTM API),
but also making some non-obvious changes in RecordTransactionCommit.

So what are the alternatives?

1. Move RecordTransactionCommit to XTM. In this case we have to copy
the original RecordTransactionCommit into the DTM implementation and
patch it there. This is also not nice, because it will complicate
maintenance of the DTM implementation.
The primary idea of XTM is to allow development of a DTM as a standard
PostgreSQL extension, without creating specific clones of the main
PostgreSQL source tree. But this idea will be compromised if we have to
copy and paste pieces of PostgreSQL code.
In some sense it is even worse than maintaining a separate branch: in
the latter case we at least have some way to perform an automatic merge.

2. Propose some alternative two-phase commit implementation in
PostgreSQL core. The main motivation for such a "lightweight"
implementation of 2PC in pg_dtm is that the original mechanism of
prepared transactions in PostgreSQL adds too much overhead.
In our benchmarks we have found that a simple credit-debit banking test
(without any DTM) runs almost 10 times slower with PostgreSQL 2PC than
without it. This is why we are trying to propose an alternative solution
(right now pg_dtm is 2 times slower than vanilla PostgreSQL, but it not
only performs 2PC but also provides consistent snapshots).

Maybe somebody can suggest some other solution?
Or give some comments concerning the current approach?

Thanks in advance,
Konstantin,
Postgres Professional
