Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-09-22 10:11:33
Message-ID: CAA4eK1KEf6w1azJytYnLhd2Lo-=rPkRh1Jv2ePcvs=SKqNQnAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 22, 2022 at 8:59 AM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>

Few comments on v33-0001
=======================
1.
+ else if (data->streaming == SUBSTREAM_PARALLEL &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support
streaming=parallel mode, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));

I think we can improve this error message as: "requested
proto_version=%d does not support parallel streaming mode, need %d or
higher".

2.
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3184,7 +3184,7 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
</para>
<para>
OID of the relation that the worker is synchronizing; null for the
- main apply worker
+ main apply worker and the apply parallel worker
</para></entry>
</row>

This and other changes in monitoring.sgml refers the workers as "apply
parallel worker". Isn't it better to use parallel apply worker as we
are using at other places in the patch? But, I have another question,
do we really need to display entries for parallel apply workers in
pg_stat_subscription if it doesn't have any meaningful information? I
think we can easily avoid it in pg_stat_get_subscription by checking
apply_leader_pid.

3.
ApplyWorkerMain()
{
...
...
+
+ if (server_version >= 160000 &&
+ MySubscription->stream == SUBSTREAM_PARALLEL)
+ options.proto.logical.streaming = pstrdup("parallel");

After deciding here whether the parallel streaming mode is enabled or
not, we recheck the same thing in apply_handle_stream_abort() and
parallel_apply_can_start(). In parallel_apply_can_start(), we do it
via two different checks. How about storing this information say in
structure MyLogicalRepWorker in ApplyWorkerMain() and then use it at
other places?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2022-09-22 10:42:58 cannot to compile master in llvm_resolve_symbols
Previous Message Fujii Masao 2022-09-22 09:47:50 Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)