Re: Question concerning XTM (eXtensible Transaction Manager API)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question concerning XTM (eXtensible Transaction Manager API)
Date: 2015-11-17 18:50:36
Message-ID: 20151117185036.GE614468@alvherre.pgsql
Lists: pgsql-hackers

konstantin knizhnik wrote:

> The transaction is normally committed in xlog, so that it can always be recovered in case of node fault.
> But before setting correspondent bit(s) in CLOG and releasing locks we first contact arbiter to get global status of transaction.
> If it is successfully locally committed by all nodes, then arbiter approves commit and commit of transaction normally completed.
> Otherwise arbiter rejects commit. In this case DTM marks transaction as aborted in CLOG and returns error to the client.
> XLOG is not changed and in case of failure PostgreSQL will try to replay this transaction.
> But during recovery it also tries to restore transaction status in CLOG.
> And at this place DTM contacts arbiter to know status of transaction.
> If it is marked as aborted in arbiter's CLOG, then it will be also marked as aborted in local CLOG.
> And according to PostgreSQL visibility rules no other transaction will see changes made by this transaction.
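
For readers following the thread, the commit path described in the quoted text can be sketched roughly as below. This is only an illustrative model, not the DTM code: the names `Arbiter`, `Node`, `report_local_commit`, `global_status`, `local_commit` and `finish_commit` are all hypothetical, and the WAL and CLOG are reduced to a list and a dict.

```python
from enum import Enum


class Status(Enum):
    COMMITTED = "committed"
    ABORTED = "aborted"


class Arbiter:
    """Hypothetical arbiter: approves a commit only if every node committed locally."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.votes = {}   # xid -> set of node names that reported a local commit
        self.status = {}  # xid -> Status, standing in for the arbiter's own CLOG

    def report_local_commit(self, xid, node):
        self.votes.setdefault(xid, set()).add(node)

    def global_status(self, xid):
        # Decide once, then keep returning the same verdict.
        if xid not in self.status:
            all_committed = self.votes.get(xid, set()) == self.nodes
            self.status[xid] = Status.COMMITTED if all_committed else Status.ABORTED
        return self.status[xid]


class Node:
    """Hypothetical node: WAL is an append-only list, CLOG is a dict."""

    def __init__(self, name, arbiter):
        self.name = name
        self.arbiter = arbiter
        self.wal = []   # xids that have a commit record in the local WAL
        self.clog = {}  # xid -> Status

    def local_commit(self, xid):
        # Step 1: the commit record goes to the WAL before the arbiter is asked.
        self.wal.append(xid)
        self.arbiter.report_local_commit(xid, self.name)

    def finish_commit(self, xid):
        # Step 2: only the CLOG reflects the arbiter's verdict; the WAL is untouched.
        self.clog[xid] = self.arbiter.global_status(xid)
        return self.clog[xid]
```

In this model a rejected transaction ends up with a commit record in the WAL and an aborted status in the CLOG, which is the combination the rest of this message is about.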

One problem I see with this approach is that the WAL replay can happen
long after it was written; for instance you might have saved a
basebackup and WAL stream and replay it all several days or weeks later,
when the arbiter no longer has information about the XID. Later
transactions might (will) depend on the aborted state of the transaction
in question, so this effectively corrupts the database.

In other words, while it's reasonable to require that the arbiter can
always be contacted for transaction commit/abort at run time, it's
not reasonable to contact the arbiter during WAL replay.
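
To make the failure mode concrete, here is a small sketch of a replay that happens after the arbiter has discarded its record of an XID. Everything in it is a made-up model, not PostgreSQL or DTM code: `ForgetfulArbiter`, `record`, `purge`, `lookup` and `replay` are hypothetical names, and the fallback of trusting the WAL commit record when the arbiter has no answer is an assumption chosen for illustration.

```python
class ForgetfulArbiter:
    """Hypothetical arbiter that eventually discards old transaction state."""

    def __init__(self):
        self.status = {}  # xid -> "committed" or "aborted"

    def record(self, xid, verdict):
        self.status[xid] = verdict

    def purge(self, xid):
        # Models the arbiter no longer having information about the XID.
        self.status.pop(xid, None)

    def lookup(self, xid):
        # None means the arbiter does not know this XID any more.
        return self.status.get(xid)


def replay(wal_commit_records, arbiter):
    """Rebuild a CLOG from WAL commit records, asking the arbiter for each XID."""
    clog = {}
    unresolved = []
    for xid in wal_commit_records:
        verdict = arbiter.lookup(xid)
        if verdict is None:
            unresolved.append(xid)
            # ASSUMPTION: with no answer, replay trusts the WAL commit record.
            clog[xid] = "committed"
        else:
            clog[xid] = verdict
    return clog, unresolved


arbiter = ForgetfulArbiter()
arbiter.record(100, "aborted")      # rejected at run time, but committed in WAL

soon, _ = replay([100], arbiter)    # replay while the arbiter still remembers
arbiter.purge(100)                  # days or weeks pass
late, unknown = replay([100], arbiter)

print(soon[100], late[100], unknown)
```

The same WAL produces a different CLOG depending on when it is replayed, so any later transaction that relied on XID 100 being aborted now sees its changes as committed.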

I think this merits more explanation:

> The transaction is normally committed in xlog, so that it can always be recovered in case of node fault.

Why would anyone want to "recover" a transaction that was aborted?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
