RE: Perform streaming logical transactions by background workers and parallel apply

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-25 14:24:50
Message-ID: OS0PR01MB57164C52100F6B68D2E3874194CE9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, January 25, 2023 7:30 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my review comments for patch v87-0002.

Thanks for your comments.

> ======
> doc/src/sgml/config.sgml
>
> 1.
> <para>
> - Allows streaming or serializing changes immediately in
> logical decoding.
> The allowed values of
> <varname>logical_replication_mode</varname> are
> - <literal>buffered</literal> and <literal>immediate</literal>. When
> set
> - to <literal>immediate</literal>, stream each change if
> + <literal>buffered</literal> and <literal>immediate</literal>.
> The default
> + is <literal>buffered</literal>.
> + </para>
>
> I didn't think it was necessary to say “of logical_replication_mode”.
> IMO that much is already obvious because this is the first sentence of the
> description for logical_replication_mode.
>

Changed.

> ~~~
>
> 2.
> + <para>
> + On the publisher side, it allows streaming or serializing changes
> + immediately in logical decoding. When set to
> + <literal>immediate</literal>, stream each change if
> <literal>streaming</literal> option (see optional parameters set by
> <link linkend="sql-createsubscription"><command>CREATE
> SUBSCRIPTION</command></link>)
> is enabled, otherwise, serialize each change. When set to
> - <literal>buffered</literal>, which is the default, decoding will stream
> - or serialize changes when
> <varname>logical_decoding_work_mem</varname>
> - is reached.
> + <literal>buffered</literal>, decoding will stream or serialize changes
> + when <varname>logical_decoding_work_mem</varname> is
> reached.
> </para>
>
> 2a.
> "it allows" --> "logical_replication_mode allows"
>
> 2b.
> "decoding" --> "the decoding"

Changed.

> ~~~
>
> 3.
> + <para>
> + On the subscriber side, if <literal>streaming</literal> option is set
> + to <literal>parallel</literal>, this parameter also allows the leader
> + apply worker to send changes to the shared memory queue or to
> serialize
> + changes. When set to <literal>buffered</literal>, the leader sends
> + changes to parallel apply workers via shared memory queue. When
> set to
> + <literal>immediate</literal>, the leader serializes all changes to
> + files and notifies the parallel apply workers to read and apply them at
> + the end of the transaction.
> + </para>
>
> "this parameter also allows" --> "logical_replication_mode also allows"

Changed.

> ~~~
>
> 4.
> <para>
> This parameter is intended to be used to test logical decoding and
> replication of large transactions for which otherwise we need to
> generate the changes till
> <varname>logical_decoding_work_mem</varname>
> - is reached.
> + is reached. Moreover, this can also be used to test the transmission of
> + changes between the leader and parallel apply workers.
> </para>
>
> "Moreover, this can also" --> "It can also"
>
> I am wondering would this sentence be better put at the top of the GUC
> description. So then the first paragraph becomes like this:
>
>
> SUGGESTION (I've also added another sentence "The effect of...")
>
> The allowed values are buffered and immediate. The default is buffered. This
> parameter is intended to be used to test logical decoding and replication of large
> transactions for which otherwise we need to generate the changes till
> logical_decoding_work_mem is reached. It can also be used to test the
> transmission of changes between the leader and parallel apply workers. The
> effect of logical_replication_mode is different for the publisher and
> subscriber:
>
> On the publisher side...
>
> On the subscriber side...

I think your suggestion makes sense, so changed as suggested.

> ======
> .../replication/logical/applyparallelworker.c
>
> 5.
> + /*
> + * In immeidate mode, directly return false so that we can switch to
> + * PARTIAL_SERIALIZE mode and serialize remaining changes to files.
> + */
> + if (logical_replication_mode == LOGICAL_REP_MODE_IMMEDIATE) return
> + false;
>
> Typo "immediate"
>
> Also, I felt "directly" is not needed. "return false" and "directly return false" is the
> same.
>
> SUGGESTION
> Using ‘immediate’ mode returns false to cause a switch to PARTIAL_SERIALIZE
> mode so that the remaining changes will be serialized.

Changed.

> ======
> src/backend/utils/misc/guc_tables.c
>
> 6.
> {
> {"logical_replication_mode", PGC_USERSET, DEVELOPER_OPTIONS,
> - gettext_noop("Allows streaming or serializing each change in logical
> decoding."),
> - NULL,
> + gettext_noop("Controls the behavior of logical replication publisher
> and subscriber"),
> + gettext_noop("If set to immediate, on the publisher side, it "
> + "allows streaming or serializing each change in "
> + "logical decoding. On the subscriber side, in "
> + "parallel streaming mode, it allows the leader apply "
> + "worker to serialize changes to files and notifies "
> + "the parallel apply workers to read and apply them at "
> + "the end of the transaction."),
> GUC_NOT_IN_SAMPLE
> },
>
> 6a. short description
>
> User PoV behaviour should be the same. Instead, maybe say "controls the
> internal behavior" or something like that?

Changed to "internal behavior xxx"

> ~
>
> 6b. long description
>
> IMO the long description shouldn’t mention ‘immediate’ mode first as it does.
>
> BEFORE
> If set to immediate, on the publisher side, ...
>
> AFTER
> On the publisher side, ...

Changed.

Attach the new version patch set.
The 0001 patch is the same as the v88-0001 posted by Amit[1],
attach it here to make cfbot happy.

[1] https://www.postgresql.org/message-id/CAA4eK1JpWoaB63YULpQa1KDw_zBW-QoRMuNxuiP1KafPJzuVuw%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v88-0001-Rename-GUC-logical_decoding_mode-to-logical_repl.patch application/octet-stream 10.9 KB
v88-0002-Extend-the-logical_replication_mode-to-test-the-.patch application/octet-stream 14.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takamichi Osumi (Fujitsu) 2023-01-25 14:27:56 RE: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Takamichi Osumi (Fujitsu) 2023-01-25 14:23:51 RE: Time delayed LR (WAS Re: logical replication restrictions)