Re: [HACKERS] logical decoding of two-phase transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2021-03-17 05:57:30
Message-ID: CAA4eK1LNLA20ci3_qqNQv7BYRTy3HqiAsOfuieqo6tJ2GeYuJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 16, 2021 at 5:03 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Mar 15, 2021 at 6:14 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > Here's a new patch-set that implements this new solution proposed by Amit.
> > Patchset-v60 implements:
> >
>
> I have reviewed the latest patch and below are my comments, some of
> these might overlap with Vignesh's as I haven't looked at his comments
> in detail.
> Review comments
> ================
>

Few more comments:
=================
1.
+ <structfield>subtwophase</structfield> <type>char</type>
+ </para>
+ <para>
+ The <varname>two_phase commit current state:</varname>
+ <itemizedlist>
+ <listitem><para><literal>'n'</literal> = two_phase mode was
not requested, so is disabled.</para></listitem>
+ <listitem><para><literal>'p'</literal> = two_phase mode was
requested, but is pending enablement.</para></listitem>
+ <listitem><para><literal>'y'</literal> = two_phase mode was
requested, and is enabled.</para></listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>

Can we name the column as subtwophasestate? And then describe as we
are doing for srsubstate in pg_subscription_rel. Also, it might be
better to keep names as: 'd' disabled, 'p' pending twophase enablement
and 'e' twophase enabled.
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>srsubstate</structfield> <type>char</type>
</para>
<para>
State code:
<literal>i</literal> = initialize,
<literal>d</literal> = data is being copied,
<literal>f</literal> = finished table copy,
<literal>s</literal> = synchronized,
<literal>r</literal> = ready (normal replication)
</para></entry>
</row>

2.
@@ -427,6 +428,10 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", streaming 'on'");

+ if (options->proto.logical.twophase &&
+ PQserverVersion(conn->streamConn) >= 140000)
+ appendStringInfoString(&cmd, ", two_phase 'on'");
+
pubnames = options->proto.logical.publication_names;
pubnames_str = stringlist_to_identifierstr(conn->streamConn, pubnames);
if (!pubnames_str)
@@ -453,6 +458,9 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
appendStringInfo(&cmd, " TIMELINE %u",
options->proto.physical.startpointTLI);

+ if (options->logical && two_phase)
+ appendStringInfoString(&cmd, " TWO_PHASE");
+

Why are we sending two_phase 'on' and " TWO_PHASE" separately? I think
we don't need to introduce TWO_PHASE token in grammar, let's handle it
via plugin_options similar to what we do for 'streaming'. Also, a
similar change would be required for Create_Replication_Slot.

3.
+ /*
+ * Do not allow toggling of two_phase option, this could
+ * cause missing of transactions and lead to an inconsistent
+ * replica.
+ */
+ if (!twophase)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("cannot alter two_phase option")));
+

I think here you can either give reference of worker.c to explain how
this could lead to an inconsistent replica or expand the comments here
if the information is not present elsewhere.

4.
* support for streaming large transactions.
+ *
+ * LOGICALREP_PROTO_2PC_VERSION_NUM is the minimum protocol version with
+ * support for two-phase commit PREPARE decoding.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
+#define LOGICALREP_PROTO_2PC_VERSION_NUM 2

I think it is better to name the new define as
LOGICALREP_PROTO_TWOPHASE_VERSION_NUM. Also mention in comments in
some way that we are keeping the same version number for stream and
two-phase defines because they got introduced in the same release
(14).

5. I have modified the comments atop worker.c to explain the design
and some of the problems clearly. See attached. If you are fine with
this, please include it in the next version of the patch.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
change_two_phase_desc_1.patch application/octet-stream 4.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-03-17 05:58:02 Re: Assertion failure with barriers in parallel hash join
Previous Message Andres Freund 2021-03-17 05:57:18 Re: Getting better results from valgrind leak tracking