RE: Perform streaming logical transactions by background workers and parallel apply

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-04 06:40:05
Message-ID: OS3PR01MB6275B1DAFAD2EE1A50407F669E9F9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 25, 2022 at 21:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Few comments on 0001:
> ======================

Thanks for your comments.

> 1.
> - <structfield>substream</structfield> <type>bool</type>
> + <structfield>substream</structfield> <type>char</type>
> </para>
> <para>
> - If true, the subscription will allow streaming of in-progress
> - transactions
> + Controls how to handle the streaming of in-progress transactions:
> + <literal>f</literal> = disallow streaming of in-progress transactions,
> + <literal>t</literal> = spill the changes of in-progress transactions to
> + disk and apply at once after the transaction is committed on the
> + publisher,
> + <literal>p</literal> = apply changes directly using a background worker
>
> Shouldn't the description of 'p' be something like: apply changes
> directly using a background worker, if available, otherwise, it
> behaves the same as 't'

Improved as suggested.

> 2.
> Note that if an error happens when
> + applying changes in a background worker, the finish LSN of the
> + remote transaction might not be reported in the server log.
>
> Is there any case where finish LSN can be reported when applying via
> background worker, if not, then we should use 'won't' instead of
> 'might not'?

Yes, I think the use case you mentioned exists. (The finish LSN can be reported
when applying via background worker). So I do not change this.
For example, in the function apply_handle_stream_commit , if an error occurs
after invoking the function set_apply_error_context_xact, I think the error
message will contain the finish LSN.

> 3.
> +#define PG_LOGICAL_APPLY_SHM_MAGIC 0x79fb2447 // TODO Consider
> change
>
> It is better to change this as the same magic number is used by
> PG_TEST_SHM_MQ_MAGIC

Improved as suggested. I changed it to a random magic number (0x787ca067) that
doesn't duplicate in the HEAD.

> 4.
> + /* Ignore statistics fields that have been updated. */
> + s.cursor += IGNORE_SIZE_IN_MESSAGE;
>
> Can we change the comment to: "Ignore statistics fields that have been
> updated by the main apply worker."? Will it be better to name the
> define as "SIZE_STATS_MESSAGE"?

Improved the comments and the macro name as suggested.

> 5.
> +/* Apply Background Worker main loop */
> +static void
> +LogicalApplyBgwLoop(shm_mq_handle *mqh, volatile ApplyBgworkerShared
> *shared)
> {
> ...
> ...
>
> + apply_dispatch(&s);
> +
> + if (ConfigReloadPending)
> + {
> + ConfigReloadPending = false;
> + ProcessConfigFile(PGC_SIGHUP);
> + }
> +
> + MemoryContextSwitchTo(oldctx);
> + MemoryContextReset(ApplyMessageContext);
>
> We should not process the config file under ApplyMessageContext. You
> should switch context before processing the config file. See other
> similar usages in the code.

Fixed as suggested.
In addition, the apply bgworker misses switching "CurrentMemoryContext" back to
oldctx when it receives a "STOP" message. This will make oldctx lose track of
"TopMemoryContext". Fixed this by invoking `MemoryContextSwitchTo(oldctx);`
when processing the "STOP" message.

> 6.
> +/* Apply Background Worker main loop */
> +static void
> +LogicalApplyBgwLoop(shm_mq_handle *mqh, volatile ApplyBgworkerShared
> *shared)
> {
> ...
> ...
> + MemoryContextSwitchTo(oldctx);
> + MemoryContextReset(ApplyMessageContext);
> + }
> +
> + MemoryContextSwitchTo(TopMemoryContext);
> + MemoryContextReset(ApplyContext);
> ...
> }
>
> I don't see the need to reset ApplyContext here as we don't do
> anything in that context here.

Improved as suggested.
Removed the invocation of function MemoryContextReset(ApplyContext).

The new patches were attached in [1].

[1] - https://www.postgresql.org/message-id/OS3PR01MB6275D64BE7726B0221B15F389E9F9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2022-08-04 06:56:02 Re: support for SSE2 intrinsics
Previous Message wangw.fnst@fujitsu.com 2022-08-04 06:38:23 RE: Perform streaming logical transactions by background workers and parallel apply