Re: [HACKERS] logical decoding of two-phase transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2021-05-18 11:31:47
Message-ID: CAA4eK1Jz64rwLyB6H7Z_SmEDouJ41KN42=VkVFp6JTpafJFG8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 13, 2021 at 3:20 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Please find attached the latest patch set v75*
>

Review comments for v75-0001-Add-support-for-prepared-transactions-to-built-i:
===============================================================================
1.
- <term><literal>CREATE_REPLICATION_SLOT</literal> <replaceable
class="parameter">slot_name</replaceable> [
<literal>TEMPORARY</literal> ] { <literal>PHYSICAL</literal> [
<literal>RESERVE_WAL</literal> ] | <literal>LOGICAL</literal>
<replaceable class="parameter">output_plugin</replaceable> [
<literal>EXPORT_SNAPSHOT</literal> |
<literal>NOEXPORT_SNAPSHOT</literal> | <literal>USE_SNAPSHOT</literal>
] }
+ <term><literal>CREATE_REPLICATION_SLOT</literal> <replaceable
class="parameter">slot_name</replaceable> [
<literal>TEMPORARY</literal> ] [ <literal>TWO_PHASE</literal> ] {
<literal>PHYSICAL</literal> [ <literal>RESERVE_WAL</literal> ] |
<literal>LOGICAL</literal> <replaceable
class="parameter">output_plugin</replaceable> [
<literal>EXPORT_SNAPSHOT</literal> |
<literal>NOEXPORT_SNAPSHOT</literal> | <literal>USE_SNAPSHOT</literal>
] }

Can we do some testing of the code related to this in some way? One
random idea could be to change the current subscriber-side code just
for testing purposes to see if this works. Can we enhance and use
pg_recvlogical to test this? It is possible that if you address
comment number 13 below, this can be tested with Create Subscription
command.

2.
- belong to the same transaction. It also sends changes of large in-progress
- transactions between a pair of Stream Start and Stream Stop messages. The
- last stream of such a transaction contains Stream Commit or Stream Abort
- message.
+ belong to the same transaction. Similarly, all messages between a pair of
+ Begin Prepare and Commit Prepared messages belong to the same transaction.

I think here we need to write Prepare instead of Commit Prepared
because Commit Prepared for a transaction can come at a later point of
time and all the messages in-between won't belong to the same
transaction.

3.
+<!-- ==================== TWO_PHASE Messages ==================== -->
+
+<para>
+The following messages (Begin Prepare, Prepare, Commit Prepared,
Rollback Prepared)
+are available since protocol version 3.
+</para>

I am not sure here marker like "TWO_PHASE Messages" is required. We
don't have any such marker for streaming messages.

4.
+<varlistentry>
+<term>Int64</term>
+<listitem><para>
+ Timestamp of the prepare transaction.

Isn't it better to write this description as "Prepare timestamp of the
transaction" to match with the similar description of Commit
timestamp. Also, there are similar occurances in the patch at other
places, change those as well.

5.
+<term>Begin Prepare</term>
+<listitem>
+<para>
...
+<varlistentry>
+<term>Int32</term>
+<listitem><para>
+ Xid of the subtransaction (will be same as xid of the
transaction for top-level
+ transactions).

The above description seems wrong to me. It should be Xid of the
transaction as we won't receive Xid of subtransaction in Begin
message. The same applies to the prepare/commit prepared/rollback
prepared transaction messages as well, so change that as well
accordingly.

6.
+<term>Byte1('P')</term>
+<listitem><para>
+ Identifies this message as a two-phase prepare
transaction message.
+</para></listitem>

In all the similar messages, we are using "Identifies the message as
...". I feel it is better to be consistent in this and similar
messages in the patch.

7.
+<varlistentry>
+
+<term>Rollback Prepared</term>
+<listitem>
..
+<varlistentry>
+<term>Int64</term>
+<listitem><para>
+ The LSN of the prepare.
+</para></listitem>

This should be end LSN of the prepared transaction.

8.
+bool
+LookupGXact(const char *gid, XLogRecPtr prepare_end_lsn,
+ TimestampTz origin_prepare_timestamp)
..
..
+ /*
+ * We are neither expecting the collisions of GXACTs (same gid)
+ * between publisher and subscribers nor the apply worker restarts
+ * after prepared xacts,

The second part of the comment ".. nor the apply worker restarts after
prepared xacts .." is no longer true after commit 8bdb1332eb[1]. So,
we can remove it.

9.
+ /*
+ * Does the subscription have tables?
+ *
+ * If there were not-READY relations found then we know it does. But if
+ * table_state_no_ready was empty we still need to check again to see
+ * if there are 0 tables.
+ */
+ has_subrels = (list_length(table_states_not_ready) > 0) ||

Typo in comments. /table_state_no_ready/table_state_not_ready

10.
+ if (!twophase)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));

errmsg is not aligned properly. Can we make the error message clear,
something like: "cannot change two_phase option"

11.
@@ -69,7 +69,8 @@ parse_subscription_options(List *options,
char **synchronous_commit,
bool *refresh,
bool *binary_given, bool *binary,
- bool *streaming_given, bool *streaming)
+ bool *streaming_given, bool *streaming,
+ bool *twophase_given, bool *twophase)

This function already has 14 parameters and this patch adds 2 new
ones. Isn't it better to have a struct (ParseSubOptions) for these
parameters? I think that might lead to some code churn but we can have
that as a separate patch on top of which we can create two_pc patch.

12.
* The subscription two_phase commit implementation requires
+ * that replication has passed the initial table
+ * synchronization phase before the two_phase becomes properly
+ * enabled.

Can we slightly modify the starting of this sentence as:"The
subscription option 'two_phase' requires that ..."

13.
@@ -507,7 +558,16 @@ CreateSubscription(CreateSubscriptionStmt *stmt,
bool isTopLevel)
{
Assert(slotname);

- walrcv_create_slot(wrconn, slotname, false,
+ /*
+ * Even if two_phase is set, don't create the slot with
+ * two-phase enabled. Will enable it once all the tables are
+ * synced and ready. This avoids race-conditions like prepared
+ * transactions being skipped due to changes not being applied
+ * due to checks in should_apply_changes_for_rel() when
+ * tablesync for the corresponding tables are in progress. See
+ * comments atop worker.c.
+ */
+ walrcv_create_slot(wrconn, slotname, false, false,

Can't we enable two_phase if copy_data is false? Because in that case,
all relations will be in a READY state. If we do that then we should
also set two_phase state as 'enabled' during createsubscription. I
think we need to be careful to check that connect option is given and
copy_data is false before setting such a state. Now, I guess we may
not be able to optimize this to not set 'enabled' state when the
subscription has no rels.

14.
+ if (options->proto.logical.twophase &&
+ PQserverVersion(conn->streamConn) >= 140000)
+ appendStringInfoString(&cmd, ", two_phase 'on'");
+

We need to check 150000 here but for now, maybe we can add a comment
similar to what you have added in ApplyWorkerMain to avoid forgetting
this change. Probably a similar comment is required pg_dump.c.

15.
@@ -49,7 +49,7 @@ logicalrep_write_begin(StringInfo out, ReorderBufferTXN *txn)

/* fixed fields */
pq_sendint64(out, txn->final_lsn);
- pq_sendint64(out, txn->commit_time);
+ pq_sendint64(out, txn->u_op_time.prepare_time);
pq_sendint32(out, txn->xid);

Why here prepare_time? It should be commit_time. We use prepare_time
in begin_prepare not in begin.

16.
+logicalrep_write_commit_prepared(StringInfo out, ReorderBufferTXN *txn,
+ XLogRecPtr commit_lsn)
+{
+ uint8 flags = 0;
+
+ pq_sendbyte(out, LOGICAL_REP_MSG_COMMIT_PREPARED);
+
+ /*
+ * This should only ever happen for two-phase commit transactions. In
+ * which case we expect to have a valid GID. Additionally, the transaction
+ * must be prepared. See ReorderBufferFinishPrepared.
+ */
+ Assert(txn->gid != NULL);
+

The second part of the comment ("Additionally, the transaction must be
prepared) is no longer true. Also, we can combine the first two
sentences here and at other places where a similar comment is used.

17.
+ union
+ {
+ TimestampTz commit_time;
+ TimestampTz prepare_time;
+ } u_op_time;

I think it is better to name this union as xact_time or trans_time.

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8bdb1332eb51837c15a10a972c179b84f654279e

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2021-05-18 11:41:02 Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Previous Message Drouvot, Bertrand 2021-05-18 11:26:38 Re: [UNVERIFIED SENDER] Re: pg_upgrade can result in early wraparound on databases with high transaction load