Re: [HACKERS] logical decoding of two-phase transactions

From: Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>
To: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2018-08-06 18:06:13
Message-ID: 877el38j56.fsf@ars-thinkpad
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I have looked through the patches. I will first describe relativaly
serious issues I see and then proceed with small nitpicking.

- On decoding of aborted xacts. The idea to throw an error once we
detect the abort is appealing, however I think you will have problems
with subxacts in the current implementation. What if subxact issues
DDL and then aborted, but main transaction successfully committed?

- Decoding transactions at PREPARE record changes rules of the "we ship
all commits after lsn 'x'" game. Namely, it will break initial
tablesync: what if consistent snapshot was formed *after* PREPARE, but
before COMMIT PREPARED, and the plugin decides to employ 2pc? Instead
of getting inital contents + continious stream of changes the receiver
will miss the prepared xact contents and raise 'prepared xact doesn't
exist' error. I think the starting point to address this is to forbid
two-phase decoding of xacts with lsn of PREPARE less than
snapbuilder's start_decoding_at.

- Currently we will call abort_prepared cb even if we failed to actually
prepare xact due to concurrent abort. I think it is confusing for
users. We should either handle this by remembering not to invoke
abort_prepared in these cases or at least document this behaviour,
leaving this problem to the receiver side.

- I find it suspicious that DecodePrepare completely ignores actions of
SnapBuildCommitTxn. For example, to execute invalidations, the latter
sets base snapshot if our xact (or subxacts) did DDL and the snapshot
not set yet. My fantasy doesn't hint me the concrete example
where this would burn at the moment, but it should be considered.

Now, the bikeshedding.

First patch:
- I am one of those people upthread who don't think that converting
flags to bitmask is beneficial -- especially given that many of them
are mutually exclusive, e.g. xact can't be committed and aborted at
the same time. Apparently you have left this to the committer though.

Second patch:
- Applying gives me
Applying: Support decoding of two-phase transactions at PREPARE
.git/rebase-apply/patch:871: trailing whitespace.

+ row. The <function>change_cb</function> callback may access system or
+ user catalog tables to aid in the process of outputting the row
+ modification details. In case of decoding a prepared (but yet
+ uncommitted) transaction or decoding of an uncommitted transaction, this
+ change callback is ensured sane access to catalog tables regardless of
+ simultaneous rollback by another backend of this very same transaction.

I don't think we should explain this, at least in such words. As
mentioned upthread, we should warn about allowed systable_* accesses
instead. Same for message_cb.

+ /*
+ * Tell the reorderbuffer about the surviving subtransactions. We need to
+ * do this because the main transaction itself has not committed since we
+ * are in the prepare phase right now. So we need to be sure the snapshot
+ * is setup correctly for the main transaction in case all changes
+ * happened in subtransanctions
+ */

While we do certainly need to associate subxacts here, the explanation
looks weird to me. I would leave just the 'Tell the reorderbuffer about
the surviving subtransactions' as in DecodeCommit.

}
-
/*
* There's a speculative insertion remaining, just clean in up, it
* can't have been successful, otherwise we'd gotten a confirmation

Spurious newline deletion.

- I would rename ReorderBufferCommitInternal to ReorderBufferReplay:
we replay the xact there, not commit.

- If xact is empty, we will not prepare it (and call cb),
even if the output plugin asked us. However, we will call
commit_prepared cb.

- ReorderBufferTxnIsPrepared and ReorderBufferPrepareNeedSkip do the
same and should be merged with comments explaining that the answer
must be stable.

- filter_prepare_cb callback existence is checked in both decode.c and
in filter_prepare_cb_wrapper.

+ /*
+ * The transaction may or may not exist (during restarts for example).
+ * Anyways, 2PC transactions do not contain any reorderbuffers. So allow
+ * it to be created below.
+ */

Code around looks sane, but I think that ReorderBufferTXN for our xact
must *not* exist at this moment: if we are going to COMMIT/ABORT
PREPARED it, it must have been replayed and RBTXN purged immediately
after. Also, instead of misty '2PC transactions do not contain any
reorderbuffers' I would say something like 'create dummy
ReorderBufferTXN to pass it to the callback'.

- In DecodeAbort:
+ /*
+ * If it's ROLLBACK PREPARED then handle it via callbacks.
+ */
+ if (TransactionIdIsValid(xid) &&
+ !SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) &&
+

How xid can be invalid here?

- It might be worthwile to put the check
+ !SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) &&
+ parsed->dbId == ctx->slot->data.database &&
+ !FilterByOrigin(ctx, origin_id) &&

which appears 3 times now into separate function.

+ * two-phase transactions - we either have to have all of them or none.
+ * The filter_prepare callback is optional, but can only be defined when

Kind of controversial (all of them or none, but optional), might be
formulated more accurately.

+ /*
+ * Capabilities of the output plugin.
+ */
+ bool enable_twophase;

I would rename this to 'supports_twophase' since this is not an option
but a description of the plugin capabilities.

+ /* filter_prepare is optional, but requires two-phase decoding */
+ if ((ctx->callbacks.filter_prepare_cb != NULL) && (!ctx->enable_twophase))
+ ereport(ERROR,
+ (errmsg("Output plugin does not support two-phase decoding, but "
+ "registered
filter_prepared callback.")));

Don't think we need to check that...

+ * Otherwise call either PREPARE (for twophase transactions) or COMMIT
+ * (for regular ones).
+ */
+ if (rbtxn_rollback(txn))
+ rb->abort(rb, txn, commit_lsn);

This is the dead code since we don't have decoding of in-progress xacts
yet.

Third patch:
+/*
+ * An xid value pointing to a possibly ongoing or a prepared transaction.
+ * Currently used in logical decoding. It's possible that such transactions
+ * can get aborted while the decoding is ongoing.
+ */

I would explain here that this xid is checked for abort after each
catalog scan, and sent for the details to SetupHistoricSnapshot.

+ /*
+ * If CheckXidAlive is valid, then we check if it aborted. If it did, we
+ * error out
+ */
+ if (TransactionIdIsValid(CheckXidAlive) &&
+ !TransactionIdIsInProgress(CheckXidAlive) &&
+ !TransactionIdDidCommit(CheckXidAlive))
+ ereport(ERROR,
+ (errcode(ERRCODE_TRANSACTION_ROLLBACK),
+ errmsg("transaction aborted during system catalog scan")));

Probably centralize checks in one function? As well as 'We don't expect
direct calls to heap_fetch...' ones.

P.S. Looks like you have torn the thread chain: In-Reply-To header of
mail [1] is missing. Please don't do that.

[1] https://www.postgresql.org/message-id/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3%2BQ%40mail.gmail.com

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2018-08-06 18:12:00 Re: GiST VACUUM
Previous Message Stephen Frost 2018-08-06 17:12:46 Re: Standby trying "restore_command" before local WAL