Re: [HACKERS] logical decoding of two-phase transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Peter(dot)B(dot)Smith(at)fujitsu(dot)com" <Peter(dot)B(dot)Smith(at)fujitsu(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2020-10-07 05:24:20
Message-ID: CAA4eK1J_WyGPnHvwwG41JFJTKFZztZKwZXBrGzMZ6dqwY_ZoJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 6, 2020 at 10:23 AM Peter(dot)B(dot)Smith(at)fujitsu(dot)com
<Peter(dot)B(dot)Smith(at)fujitsu(dot)com> wrote:
>
>
> [BEGIN]
>
> ==========
> Patch V6-0001, File: contrib/test_decoding/expected/prepared.out (so
> prepared.sql also)
> ==========
>
> COMMENT
> Line 30 - The INSERT INTO test_prepared1 VALUES (2); is kind of
> strange because it is not really part of the prior test nor the
> following test. Maybe it would be better to have a comment describing
> the purpose of this isolated INSERT and to also consume the data from
> the slot so it does not get jumbled with the data of the following
> (abort) test.
>
> ;
>
> COMMENT
> Line 53 - Same comment for this test INSERT INTO test_prepared1 VALUES
> (4); It kind of has nothing really to do with either the prior (abort)
> test nor the following (ddl) test.
>
> ;
>
> COMMENT
> Line 60 - Seems to check which locks are held for the test_prepared_1
> table while the transaction is in progress. Maybe it would be better
> to have more comments describing what is expected here and why.
>
> ;
>
> COMMENT
> Line 88 - There is a comment in the test saying "-- We should see '7'
> before '5' in our results since it commits first." but I did not see
> any test code that actually verifies that happens.
>
> ;

All the above comments are genuine and I think it is mostly because
the author has blindly modified the existing tests without completely
understanding the intent of the test. I suggest we write a completely
new regression file (decode_prepared.sql) for these and just copy
whatever is required from prepared.sql. Once we do that we might also
want to rename existing prepared.sql to decode_commit_prepared.sql or
something like that. I think modifying the existing test appears to be
quite ugly and also it is changing the intent of the existing tests.

>
> QUESTION
> Line 120 - I did not really understand the SQL checking the pg_class.
> I expected this would be checking table 'test_prepared1' instead. Can
> you explain it?
> SELECT 'pg_class' AS relation, locktype, mode
> FROM pg_locks
> WHERE locktype = 'relation'
> AND relation = 'pg_class'::regclass;
> relation | locktype | mode
> ----------+----------+------
> (0 rows)
>
> ;

Yes, I also think your expectation is correct and this should be on
'test_prepared_1'.

>
> QUESTION
> Line 139 - SET statement_timeout = '1s'; is 1 seconds short enough
> here for this test, or might it be that these statements would be
> completed in less than one seconds anyhow?
>
> ;

Good question. I think we have to mention the reason why logical
decoding is not blocked while it needs to acquire a shared lock on the
table and the previous commands already held an exclusive lock on the
table. I am not sure if I am missing something but like you, it is not
clear to me as well what this test intends to do, so surely more
commentary is required.

>
> QUESTION
> Line 163 - How is this testing a SAVEPOINT? Or is it only to check
> that the SAVEPOINT command is not part of the replicated changes?
>
> ;

It is more of testing that subtransactions will not create a problem
while decoding.

>
> COMMENT
> Line 175 - Missing underscore in comment. Code requires also underscore:
> "nodecode" --> "_nodecode"
>

makes sense.

> ==========
> Patch V6-0001, File: contrib/test_decoding/test_decoding.c
> ==========
>
> COMMENT
> Line 43
> @@ -36,6 +40,7 @@ typedef struct
> bool skip_empty_xacts;
> bool xact_wrote_changes;
> bool only_local;
> + TransactionId check_xid; /* track abort of this txid */
> } TestDecodingData;
>
> The "check_xid" seems a meaningless name. Check what?
> IIUC maybe should be something like "check_xid_aborted"
>
> ;
>
> COMMENT
> Line 105
> @ -88,6 +93,19 @@ static void
> pg_decode_stream_truncate(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> int nrelations, Relation relations[],
> ReorderBufferChange *change);
> +static bool pg_decode_filter_prepare(LogicalDecodingContext *ctx,
> + ReorderBufferTXN *txn,
>
> Remove extra blank line after these functions
>
> ;

The above two sounds reasonable suggestions.

>
> COMMENT
> Line 149
> @@ -116,6 +134,11 @@ _PG_output_plugin_init(OutputPluginCallbacks *cb)
> cb->stream_change_cb = pg_decode_stream_change;
> cb->stream_message_cb = pg_decode_stream_message;
> cb->stream_truncate_cb = pg_decode_stream_truncate;
> + cb->filter_prepare_cb = pg_decode_filter_prepare;
> + cb->prepare_cb = pg_decode_prepare_txn;
> + cb->commit_prepared_cb = pg_decode_commit_prepared_txn;
> + cb->abort_prepared_cb = pg_decode_abort_prepared_txn;
> +
> }
>
> There is a confusing mix of terminology where sometimes things are
> referred as ROLLBACK/rollback and other times apparently the same
> operation is referred as ABORT/abort. I do not know the root cause of
> this mixture. IIUC maybe the internal functions and protocol generally
> use the term "abort", whereas the SQL syntax is "ROLLBACK"... but
> where those two terms collide in the middle it gets quite confusing.
>
> At least I thought the names of the "callbacks" which get exposed to
> the user (e.g. in the help) might be better if they would match the
> SQL.
> "abort_prepared_cb" --> "rollback_prepared_db"
>

This suggestion sounds reasonable. I think it is to entertain the case
where due to error we need to rollback the transaction. I think it is
better if use 'rollback' terminology in the exposed functions. We
already have a function with the name stream_abort_cb in the code
which we also might want to rename but that is a separate thing and we
can deal it with a separate patch.

> There are similar review comments like this below where the
> alternating terms caused me some confusion.
>
> ~
>
> Also, Remove the extra blank line before the end of the function.
>
> ;
>
> COMMENT
> Line 267
> @ -227,6 +252,42 @@ pg_decode_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
> errmsg("could not parse value \"%s\" for parameter \"%s\"",
> strVal(elem->arg), elem->defname)));
> }
> + else if (strcmp(elem->defname, "two-phase-commit") == 0)
> + {
> + if (elem->arg == NULL)
> + continue;
>
> IMO the "check-xid" code might be better rearranged so the NULL check
> is first instead of if/else.
> e.g.
> if (elem->arg == NULL)
> ereport(FATAL,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("check-xid needs an input value")));
> ~
>
> Also, is it really supposed to be FATAL instead or ERROR. That is not
> the same as the other surrounding code.
>
> ;

+1.

>
> COMMENT
> Line 296
> if (data->check_xid <= 0)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("Specify positive value for parameter \"%s\","
> " you specified \"%s\"",
> elem->defname, strVal(elem->arg))));
>
> The code checking for <= 0 seems over-complicated. Because conversion
> was using strtoul() I fail to see how this can ever be < 0. Wouldn't
> it be easier to simply test the result of the strtoul() function?
>
> BEFORE: if (errno == EINVAL || errno == ERANGE)
> AFTER: if (data->check_xid == 0)
>

Better to use TransactionIdIsValid(data->check_xid) here.

> ~
>
> Also, should this be FATAL? Everything else similar is ERROR.
>
> ;

It should be an error.

>
> COMMENT
> (general)
> I don't recall seeing any of these decoding options (e.g.
> "two-phase-commit", "check-xid") documented anywhere.
> So how can a user even know these options exist so they can use them?
> Perhaps options should be described on this page?
> https://www.postgresql.org/docs/13/functions-admin.html#FUNCTIONS-REPLICATION
>
> ;

I think we should do what we are doing for other options, if they are
not documented then why to document this one separately. I guess we
can make a case to document all the existing options and write a
separate patch for that.

>
> COMMENT
> (general)
> "check-xid" is a meaningless option name. Maybe something like
> "checked-xid-aborted" is more useful?
> Suggest changing the member, the option, and the error messages to
> match some better name.
>
> ;
>
> COMMENT
> Line 314
> @@ -238,6 +299,7 @@ pg_decode_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
> }
>
> ctx->streaming &= enable_streaming;
> + ctx->enable_twophase &= enable_2pc;
> }
>
> The "ctx->enable_twophase" is inconsistent naming with the
> "ctx->streaming" member.
> "enable_twophase" --> "twophase"
>
> ;

+1.

>
> COMMENT
> Line 374
> @@ -297,6 +359,94 @@ pg_decode_commit_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> OutputPluginWrite(ctx, true);
> }
>
> +
> +/*
> + * Filter out two-phase transactions.
> + *
> + * Each plugin can implement its own filtering logic. Here
> + * we demonstrate a simple logic by checking the GID. If the
> + * GID contains the "_nodecode" substring, then we filter
> + * it out.
> + */
> +static bool
> +pg_decode_filter_prepare(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
>
> Remove the extra preceding blank line.
>
> ~
>
> I did not find anything in the help about "_nodecode". Should it be
> there or is this deliberately not documented feature?
>
> ;

I guess we can document it along with filter_prepare API, if not
already documented.

>
> QUESTION
> Line 440
> +pg_decode_abort_prepared_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
>
> Is this a wrong comment
> "ABORT PREPARED" --> "ROLLBACK PREPARED" ??
>
> ;
>
> COMMENT
> Line 620
> @@ -455,6 +605,22 @@ pg_decode_change(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> }
> data->xact_wrote_changes = true;
>
> + /* if check_xid is specified */
> + if (TransactionIdIsValid(data->check_xid))
> + {
> + elog(LOG, "waiting for %u to abort", data->check_xid);
> + while (TransactionIdIsInProgress(dat
>
> The check_xid seems a meaningless name, and the comment "/* if
> check_xid is specified */" was not helpful either.
> IIUC purpose of this is to check that the nominated xid always is rolled back.
> So the appropriate name may be more like "check-xid-aborted".
>
> ;

Yeah, this part deserves better comments.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-10-07 05:50:42 Re: Improve choose_custom_plan for initial partition prune case
Previous Message Thomas Munro 2020-10-07 05:17:27 Re: Two fsync related performance issues?