Re: [HACKERS] logical decoding of two-phase transactions

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2020-10-23 10:10:35
Message-ID: CAFPTHDZoV3B6fthfkNDcdoCYftG5T219EAJ-9sneDPdQjD_Z=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 20, 2020 at 3:15 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hello Ajin.
>
> I have gone through the v10 patches to verify if and how my previous
> v6 review comments got addressed.
>
> Some issues remain, and there are a few newly introduced ones.
>
> Mostly it is all very minor stuff.
>
> Please find my revised review comments below.
>
> Kind Regards.
> Peter Smith
> Fujitsu Australia
>
> ---
>
> V10 REVIEW COMMENTS FOLLOW
>
> ==========
> Patch v10-0001, File: contrib/test_decoding/test_decoding.c
> ==========
>
> COMMENT
> Line 285
> + {
> + errno = 0;
> + data->check_xid_aborted = (TransactionId)
> + strtoul(strVal(elem->arg), NULL, 0);
> +
> + if (!TransactionIdIsValid(data->check_xid_aborted))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("check-xid-aborted is not a valid xid: \"%s\"",
> + strVal(elem->arg))));
> + }
>
>
> I think it is risky to assign strtoul directly to the
> check_xid_aborted member because it makes some internal assumption
> that the invalid transaction is the same as the error return from
> strtoul.
>
> Maybe better to do in 2 steps like below:
>
> BEFORE
> errno = 0;
> data->check_xid_aborted = (TransactionId)strtoul(strVal(elem->arg), NULL, 0);
>
> AFTER
> long xid;
> errno = 0;
> xid = strtoul(strVal(elem->arg), NULL, 0);
> if (xid == 0 || errno != 0)
> data->check_xid_aborted = InvalidTransactionId;
> else
> data->check_xid_aborted =(TransactionId)xid;
>
> ---
Updated accordingly.

>
> COMMENT
> Line 430
> +
> +/* ABORT PREPARED callback */
> +static void
> +pg_decode_rollback_prepared_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> + XLogRecPtr abort_lsn)
>
>
> Fix comment "ABORT PREPARED" --> "ROLLBACK PREPARED"

Updated accordingly.

>
> ==========
> Patch v10-0001, File: doc/src/sgml/logicaldecoding.sgml
> ==========
>
> COMMENT
> Section 48.6.1
> Says:
> An output plugin may also define functions to support streaming of
> large, in-progress transactions. The stream_start_cb, stream_stop_cb,
> stream_abort_cb, stream_commit_cb and stream_change_cb are required,
> while stream_message_cb, stream_prepare_cb, stream_commit_prepared_cb,
> stream_rollback_prepared_cb and stream_truncate_cb are optional.
>
> An output plugin may also define functions to support two-phase
> commits, which are decoded on PREPARE TRANSACTION. The prepare_cb,
> commit_prepared_cb and rollback_prepared_cb callbacks are required,
> while filter_prepare_cb is optional.
>
> -
>
> But is that correct? It seems strange/inconsistent to say that the 2PC
> callbacks are mandatory for the non-streaming, but that they are
> optional for streaming.
>

Updated making all the 2PC callbacks mandatory.

> ---
>
> COMMENT
> 48.6.4.5 "Transaction Prepare Callback"
> 48.6.4.6 "Transaction Commit Prepared Callback"
> 48.6.4.7 "Transaction Rollback Prepared Callback"
>
> There seems some confusion about what is optional and what is
> mandatory. e.g. Why are the non-stream 2PC callbacks mandatory but the
> stream 2PC callbacks are not? And also there is some inconsistency
> with what is said in the paragraph at the top of the page versus what
> each of the callback sections says wrt optional/mandatory.
>
> The sub-sections 49.6.4.5, 49.6.4.6, 49.6.4.7 say those callbacks are
> optional which IIUC Amit said is incorrect. This is similar to the
> previous review comment
>
> ---

Updated making all the 2PC callbacks mandatory.

>
> COMMENT
> Section 48.6.4.7 "Transaction Rollback Prepared Callback"
>
> parameter "abort_lsn" probably should be "rollback_lsn"
>
> ---
>
> COMMENT
> Section 49.6.4.18. "Stream Rollback Prepared Callback"
> Says:
> The stream_rollback_prepared_cb callback is called to abort a
> previously streamed transaction as part of a two-phase commit.
>
> maybe should say "is called to rollback"
>
> ==========
> Patch v10-0001, File: src/backend/replication/logical/logical.c
> ==========
>
> COMMENT
> Line 252
> Says: We however enable two phase logical...
>
> "two phase" --> "two-phase"
>
> --
>
> COMMENT
> Line 885
> Line 923
> Says: If the plugin support 2 phase commits...
>
> "support 2 phase" --> "supports two-phase" in the comment. Same issue
> occurs twice.
>
> ---
>
> COMMENT
> Line 830
> Line 868
> Line 906
> Says:
> /* We're only supposed to call this when two-phase commits are supported */
>
> There is an extra space between the "are" and "supported" in the comment.
> Same issue occurs 3 times.
>
> ---
>
> COMMENT
> Line 1023
> + /*
> + * Skip if decoding of two-phase at PREPARE time is not enabled. In that
> + * case all two-phase transactions are considered filtered out and will be
> + * applied as regular transactions at COMMIT PREPARED.
> + */
>
> Comment still is missing the word "transactions"
> "Skip if decoding of two-phase at PREPARE time is not enabled."
> -> "Skip if decoding of two-phase transactions at PREPARE time is not enabled.
>

Updated accordingly.

> ==========
> Patch v10-0001, File: src/include/replication/reorderbuffer.h
> ==========
>
> COMMENT
> Line 459
> /* abort prepared callback signature */
> typedef void (*ReorderBufferRollbackPreparedCB) (
> ReorderBuffer *rb,
> ReorderBufferTXN *txn,
> XLogRecPtr abort_lsn);
>
> There is no alignment consistency here for
> ReorderBufferRollbackPreparedCB. Some function args are directly under
> the "(" and some are on the same line. This function code is neither.
>
> ---
>
> COMMENT
> Line 638
> @@ -431,6 +486,24 @@ typedef void (*ReorderBufferStreamAbortCB) (
> ReorderBufferTXN *txn,
> XLogRecPtr abort_lsn);
>
> +/* prepare streamed transaction callback signature */
> +typedef void (*ReorderBufferStreamPrepareCB) (
> + ReorderBuffer *rb,
> + ReorderBufferTXN *txn,
> + XLogRecPtr prepare_lsn);
> +
> +/* prepare streamed transaction callback signature */
> +typedef void (*ReorderBufferStreamCommitPreparedCB) (
> + ReorderBuffer *rb,
> + ReorderBufferTXN *txn,
> + XLogRecPtr commit_lsn);
> +
> +/* prepare streamed transaction callback signature */
> +typedef void (*ReorderBufferStreamRollbackPreparedCB) (
> + ReorderBuffer *rb,
> + ReorderBufferTXN *txn,
> + XLogRecPtr rollback_lsn);
>
> There is no inconsistent alignment with the arguments (compare how
> other functions are aligned)
>
> See:
> - for ReorderBufferStreamCommitPreparedCB
> - for ReorderBufferStreamRollbackPreparedCB
> - for ReorderBufferPrepareNeedSkip
> - for ReorderBufferTxnIsPrepared
> - for ReorderBufferPrepare
>
> ---
>
> COMMENT
> Line 489
> Line 495
> Line 501
> /* prepare streamed transaction callback signature */
>
> Same comment cut/paste 3 times?
> - for ReorderBufferStreamPrepareCB
> - for ReorderBufferStreamCommitPreparedCB
> - for ReorderBufferStreamRollbackPreparedCB
>
> ---
>
> COMMENT
> Line 457
> /* abort prepared callback signature */
> typedef void (*ReorderBufferRollbackPreparedCB) (
> ReorderBuffer *rb,
> ReorderBufferTXN *txn,
> XLogRecPtr abort_lsn);
>
> "abort" --> "rollback" in the function comment.
>
> ---
>
> COMMENT
> Line 269
> /* In case of 2PC we need to pass GID to output plugin */
>
> "2PC" --> "two-phase commit"
>

Updated accordingly.

> ==========
> Patch v10-0002, File: contrib/test_decoding/expected/two_phase.out (and .sql)
> ==========
>
> COMMENT
> General
>
> It is a bit hard to see what are the main tests here are what are just
> sub-parts of some test case.
>
> e.g. It seems like the main tests are.
>
> 1. Test that decoding happens at PREPARE time
> 2. Test decoding of an aborted tx
> 3. Test a prepared tx which contains some DDL
> 4. Test decoding works while an uncommitted prepared tx with DDL exists
> 5. Test operations holding exclusive locks won't block decoding
> 6. Test savepoints and sub-transactions
> 7. Test "_nodecode" will defer the decoding until the commit time
>
> Can the comments be made more obvious so it is easy to distinguish the
> main tests from the steps of those tests?
>
> ---
>
> COMMENT
> Line 1
> -- Test two-phased transactions, when two-phase-commit is enabled,
> transactions are
> -- decoded at PREPARE time rather than at COMMIT PREPARED time.
>
> Some commas to be removed and this comment to be split into several sentences.
>
> ---
>
> COMMENT
> Line 19
> -- should show nothing
>
> Comment could be more informative. E.g. "Should show nothing because
> the PREPARE has not happened yet"
>
> ---
>
> COMMENT
> Line 77
>
> Looks like there is a missing comment about here that should say
> something like "Show that the DDL does not appear in the decoding"
>
> ---
>
> COMMENT
> Line 160
> -- test savepoints and sub-xacts as a result
>
> The subsequent test is testing savepoints. But is it testing sub
> transactions like the comment says?
>

Updated accordingly.

> ==========
> Patch v10-0002, File: contrib/test_decoding/t/001_twophase.pl
> ==========
>
> COMMENT
> General
>
> I think basically there are only 2 tests in this file.
> 1. to check that the concurrent abort works.
> 2. to check that the prepared tx can span a server shutdown/restart
>
> But the tests comments do not make this clear at all.
> e.g. All the "#" comments look equally important although most of them
> are just steps of each test case.
> Can the comments be better to distinguish the tests versus the steps
> of each test?
>

Updated accordingly.

> ==========
> Patch v10-0002, File: src/backend/replication/logical/decode.c
> ==========
>
> COMMENT
> Line 71
> static void DecodeCommitPrepared(LogicalDecodingContext *ctx,
> XLogRecordBuffer *buf,
> xl_xact_parsed_commit *parsed, TransactionId xid);
> static void DecodeAbort(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
> xl_xact_parsed_abort *parsed, TransactionId xid);
> static void DecodeAbortPrepared(LogicalDecodingContext *ctx,
> XLogRecordBuffer *buf,
> xl_xact_parsed_abort *parsed, TransactionId xid);
> static void DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
> xl_xact_parsed_prepare * parsed);
>
> The 2nd line or args are not aligned properly.
> - for DecodeCommitPrepared
> - for DecodeAbortPrepared
> - for DecodePrepare
>

Updated accordingly.

> ==========
> Patch v10-0002, File: src/backend/replication/logical/reorderbuffer.c
> ==========
>
> COMMENT
> There are some parts of the code where in my v6 review I had a doubt
> about the mutually exclusive treatment of the "streaming" flag and the
> "rbtxn_prepared(txn)" state.
>
> Basically I did not see how some parts of the code are treating NOT
> streaming as implying 2PC etc because it defies my understanding that
> 2PC can also work in streaming mode. Perhaps the "streaming" flag has
> a different meaning to how I interpret it? Or perhaps some functions
> are guarding higher up and can only be called under certain
> conditions?
>
> Anyway, this confusion manifests in several parts of the code, none of
> which was changed after my v6 review.
>
> Affected code includes the following:
>
> CASE 1
> Wherever the ReorderBufferTruncateTXN(...) "prepared" flag (third
> parameter) is hardwired true/false, I think there must be some
> preceding Assert to guarantee the prepared state condition holds true.
> There can't be any room for doubts like "but what will it do for
> streamed 2PC..."
> Line 1805 - ReorderBufferTruncateTXN(rb, txn, true); // if rbtxn_prepared(txn)
> Line 1941 - ReorderBufferTruncateTXN(rb, txn, false); // state ??
> Line 2389 - ReorderBufferTruncateTXN(rb, txn, false); // if streaming
> Line 2396 - ReorderBufferTruncateTXN(rb, txn, true); // if not
> streaming and if rbtxm_prepared(txn)
> Line 2459 - ReorderBufferTruncateTXN(rb, txn, true); // if not streaming
>
> ~
>
> CASE 2
> Wherever the "streaming" flag is tested I don't really understand how
> NOT streaming can automatically imply 2PC.
> Line 2330 - if (streaming) // what about if it is streaming AND 2PC at
> the same time?
> Line 2387 - if (streaming) // what about if it is streaming AND 2PC at
> the same time?
> Line 2449 - if (streaming) // what about if it is streaming AND 2PC at
> the same time?
>
> ~
>
> Case 1 and Case 2 above overlap a fair bit. I just listed them so they
> all get checked again.
>
> Even if the code is thought to be currently OK I do still think
> something should be done like:
> a) add some more substantial comments to explain WHY the combination
> of streaming and 2PC is not valid in the context
> b) the Asserts to be strengthened to 100% guarantee that the streaming
> and prepared states really are exclusive (if indeed they are). For
> this point I thought the following Assert condition could be better:
> Assert(streaming || rbtxn_prepared(txn));
> Assert(stream_started || rbtxn_prepared(txn));
> because as it is you still are left wondering if both streaming AND
> rbtxn_prepared(txn) can be possible at the same time...
>
> ---

Updated with more comments and a new Assert.

>
> COMMENT
> Line 2634
> * Anyways, two-phase transactions do not contain any reorderbuffers.
>
> "Anyways" --> "Anyway"

Updated.

>
> ==========
> Patch v10-0003, File: src/backend/access/transam/twophase.c
> ==========
>
> COMMENT
> Line 557
> @@ -548,6 +548,33 @@ MarkAsPrepared(GlobalTransaction gxact, bool lock_held)
> }
>
> /*
> + * LookupGXact
> + * Check if the prepared transaction with the given GID is around
> + */
> +bool
> +LookupGXact(const char *gid)
> +{
> + int i;
> + bool found = false;
>
> The variable declarations (i and found) are not aligned.
>

Updated.

> ==========
> Patch v10-0003, File: src/backend/replication/logical/proto.c
> ==========
>
> COMMENT
> Line 125
> Line 205
> Assert(strlen(txn->gid) > 0);
>
> I suggested that the assertion should also check txn->gid is not NULL.
> You replied "In this case txn->gid has to be non NULL".
>
> But that is exactly what I said :-)
> If it HAS to be non-NULL then why not just Assert that in code instead
> of leaving the reader wondering?
>
> "Assert(strlen(txn->gid) > 0);" --> "Assert(tdx->gid && strlen(txn->gid) > 0);"
> Same occurs several times.
>
> ---

Updated checking that gid is non-NULL as zero strlen is actually a valid case.

>
> COMMENT
> Line 133
> Line 213
> if (rbtxn_commit_prepared(txn))
> flags |= LOGICALREP_IS_COMMIT_PREPARED;
> else if (rbtxn_rollback_prepared(txn))
> flags |= LOGICALREP_IS_ROLLBACK_PREPARED;
> else
> flags |= LOGICALREP_IS_PREPARE;
>
> Previously I wrote that the use of the bit flags on assignment in the
> logicalrep_write_prepare was inconsistent with the way they are
> treated when they are read. Really it should be using a direct
> assignment instead of bit flags.
>
> You said this is skipped anticipating a possible refactor. But IMO
> this leaves the code in a half/half state. I think it is better to fix
> it properly and if refactoring happens then deal with that at the
> time.
>
> The last comment I saw from Amit said to use my 1st proposal of direct
> assignment instead of bit flag assignment.
>
> (applies to both non-stream and stream functions)
> - see logicalrep_write_prepare
> - see logicalrep_write_stream_prepare
>
>

Updated accordingly.

> ==========
> Patch v10-0003, File: src/backend/replication/pgoutput/pgoutput.c
> ==========
>
> COMMENT
> Line 429
> /*
> * PREPARE callback
> */
> static void
> pgoutput_rollback_prepared_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> XLogRecPtr prepare_lsn)
> The function comment looks wrong.
> Shouldn't this comment say be "ROLLBACK PREPARED callback"?
>
> ==========
> Patch v10-0003, File: src/include/replication/logicalproto.h
> ==========
>
> Line 115
> #define PrepareFlagsAreValid(flags) \
> ((flags == LOGICALREP_IS_PREPARE) || \
> (flags == LOGICALREP_IS_COMMIT_PREPARED) || \
> (flags == LOGICALREP_IS_ROLLBACK_PREPARED))
>
> Would be safer if all the references to flags are in parentheses
> e.g. "flags" --> "(flags)"
>

Updated accordingly.

Amit,
I have also modified the stream callback APIs to not include
stream_commit_prpeared and stream_rollback_prepared, instead use the
non-stream APIs for the same functionality.
I have also updated the test_decoding and pgoutput plugins accordingly.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v11-0003-Support-2PC-txn-pgoutput.patch application/octet-stream 48.5 KB
v11-0001-Support-2PC-txn-base.patch application/octet-stream 41.1 KB
v11-0002-Support-2PC-txn-backend-and-tests.patch application/octet-stream 47.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jürgen Purtz 2020-10-23 10:18:50 Re: [PATCH] Add section headings to index types doc
Previous Message John Naylor 2020-10-23 09:54:26 Re: speed up unicode decomposition and recomposition