RE: Perform streaming logical transactions by background workers and parallel apply

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-24 13:50:43
Message-ID: OS0PR01MB5716DEF013064BCAAA5054A194739@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 22, 2022 20:50 PM Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> Dear Wang,
>
> Thank you for updating the patch! Followings are comments about
> v23-0001 and v23-0005.

Thanks for your comments.

> v23-0001
>
> 01. logical-replication.sgml
>
> + <para>
> + When the streaming mode is <literal>parallel</literal>, the finish LSN of
> + failed transactions may not be logged. In that case, it may be necessary to
> + change the streaming mode to <literal>on</literal> and cause the same
> + conflicts again so the finish LSN of the failed transaction will be written
> + to the server log. For the usage of finish LSN, please refer to <link
> + linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
> + SKIP</command></link>.
> + </para>
>
> I was not sure about streaming='off' mode. Is there any reasons that
> only ON mode is focused?

Added off.

> 02. protocol.sgml
>
> + <varlistentry>
> + <term>Int64 (XLogRecPtr)</term>
> + <listitem>
> + <para>
> + The LSN of the abort. This field is available since protocol version
> + 4.
> + </para>
> + </listitem>
> + </varlistentry>
> +
> + <varlistentry>
> + <term>Int64 (TimestampTz)</term>
> + <listitem>
> + <para>
> + Abort timestamp of the transaction. The value is in number
> + of microseconds since PostgreSQL epoch (2000-01-01). This field is
> + available since protocol version 4.
> + </para>
> + </listitem>
> + </varlistentry>
> +
>
> It seems that changes are in the variablelist for stream commit.
> I think these are included in the stream abort message, so it should be moved.

Fixed.

> 03. decode.c
>
> - ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf-
> >origptr);
> + ReorderBufferForget(ctx->reorder,
> + parsed->subxacts[i], buf-
> >origptr,
> +
> + commit_time);
> }
> - ReorderBufferForget(ctx->reorder, xid, buf->origptr);
> + ReorderBufferForget(ctx->reorder, xid, buf->origptr,
> + commit_time);
>
> 'commit_time' has been passed as argument 'abort_time', I think it may
> be confusing.
> How about adding a comment above, like:
> "In case of streamed transactions, they are regarded as being aborted
> at commit_time"

IIRC, I free the comment above the loop might be more clear about this,
but I will think about it again.

> 04. launcher.c
>
> 04.a
>
> + worker->main_worker_pid = is_subworker ? MyProcPid : 0;
>
> You can use InvalidPid instead of 0.
> (I thought pid should be represented by the datatype pid_t, but in
> some codes it is defined as int...)
>
> 04.b
>
> + worker->main_worker_pid = 0;
>
> You can use InvalidPid instead of 0, same as above.

Improved

> 05. origin.c
>
> void
> -replorigin_session_setup(RepOriginId node)
> +replorigin_session_setup(RepOriginId node, int acquired_by)
>
> IIUC the same slot can be used only when the apply main worker has
> already acquired the slot and the subworker for the same subscription
> tries to acquire, but it cannot understand from comments.
> How about adding comments, or an assertion that acquired_by is same as
> session_replication_state->acquired_by ?
> Moreover acquired_by should be compared with InvalidPid, based on
> above comments.

I think we have tried to check if 'acquired_by' and acquired_by of
slot are equal inside this function.

I am not sure if it's a good idea to use InvalidPid here ,as we set
session_replication_state->acquired_by(int) to 0(instead of -1) to indicate
that no worker acquire it.

> 06. proto.c
>
> void
> logicalrep_write_stream_abort(StringInfo out, TransactionId xid,
> - TransactionId subxid)
> + ReorderBufferTXN *txn, XLogRecPtr abort_lsn,
> + bool
> + write_abort_lsn
>
> I think write_abort_lsn may be not needed, because abort_lsn can be
> used for controlling whether abort_XXX fields should be filled or not.

I think if the subscriber's version is lower than 16 (which won't handle the abort_XXX fields),
then we don't need to send the abort_XXX fields either.

> 07. worker.c
>
> +/*
> + * The number of changes during one streaming block (only for apply
> background
> + * workers)
> + */
> +static uint32 nchanges = 0;
>
> This variable is used only by the main apply worker, so the comment
> seems not correct.
> How about "...(only for SUBSTREAM_PARALLEL case)"?

The previous comments seemed a bit confusing. I tried to improve this comments to this:
```
The number of changes sent to apply background workers during one streaming block.
```

> v23-0005
>
> 08. monitoring.sgml
>
> I cannot decide which option proposed in [1] is better, but followings
> descriptions are needed in both cases.
> (In [2] I had intended to propose something like option 2)
>
> 08.a
>
> You can add a description that the field 'relid' will be NULL even for
> apply background worker.
>
> 08.b
>
> You can add a description that fields 'received_lsn',
> 'last_msg_send_time', 'last_msg_receipt_time', 'latest_end_lsn',
> 'latest_end_time' will be NULL for apply background worker.

Improved

Regards,
Wang wei

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-08-24 13:57:14 Re: Handle infinite recursion in logical replication setup
Previous Message houzj.fnst@fujitsu.com 2022-08-24 13:47:15 RE: Perform streaming logical transactions by background workers and parallel apply