Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-10-13 05:03:53
Message-ID: CAA4eK1+OyQ8-psruZZ0sYff5KactTHZneR-cfsHd+n+N7khEKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 12, 2022 at 3:41 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some review comments for v36-0001.
>
>
> 6. LogicalParallelApplyLoop
>
> + for (;;)
> + {
> + void *data;
> + Size len;
> + int c;
> + StringInfoData s;
> + MemoryContext oldctx;
> +
> + CHECK_FOR_INTERRUPTS();
> +
> + /* Ensure we are reading the data into our memory context. */
> + oldctx = MemoryContextSwitchTo(ApplyMessageContext);
> +
> ...
> +
> + MemoryContextSwitchTo(oldctx);
> + MemoryContextReset(ApplyMessageContext);
> + }
>
> Do those memory context switches need to happen inside the for(;;)
> loop like that? I thought perhaps those can be done *outside* of the
> loop instead of always switching and switching back on the next
> iteration.
>

I think we need to reset the ApplyMessageContext each time after
processing a message and also don't want to process the config file in
the applymessagecontext.

> ======
>
> src/backend/replication/logical/launcher.c
>
> 12. logicalrep_worker_launch
>
> Previously I suggested may the apply process name should change
>
> FROM
> "logical replication worker for subscription %u"
> TO
> "logical replication apply worker for subscription %u"
>
> and Houz wrote ([1] #13)
> I am not sure if it's a good idea to change existing process description.
>
> ~
>
> But that seems inconsistent to me because elsewhere this patch is
> already exposing the name to the user (like when it says "logical
> replication apply worker for subscription \"%s\" has started".
> Shouldn’t the process name match these logs?
>

I think it is okay to change the name here for the sake of consistency.

>
> 19. ApplyWorkerMain
>
> +
> +/* Logical Replication Apply worker entry point */
> +void
> +ApplyWorkerMain(Datum main_arg)
>
> Previously I suugested changing "Apply worker" to "apply worker", and
> Houzj ([1] #48) replied:
> Since it's the existing comment, I feel we can leave this.
>
> ~
>
> Normally I agree don't change the original code unrelated to the
> patch, but in practice, I think no patch would be accepted that just
> changes just "A" to "a", so if you don't change it here in this patch
> to be consistent then it will never happen. That's why I think should
> be part of this patch.
>

Hmm, I think one might then extend this to many other similar cosmetic
stuff in the nearby areas. It sometimes distracts the reviewer if
there are unrelated changes, so better to avoid it.

>
> 22. get_transaction_apply_action
>
> +static TransApplyAction
> +get_transaction_apply_action(TransactionId xid,
> ParallelApplyWorkerInfo **winfo)
> +{
> + *winfo = NULL;
> +
> + if (am_parallel_apply_worker())
> + {
> + return TRANS_PARALLEL_APPLY;
> + }
> + else if (in_remote_transaction)
> + {
> + return TRANS_LEADER_APPLY;
> + }
> +
> + /*
> + * Check if we are processing this transaction using a parallel apply
> + * worker and if so, send the changes to that worker.
> + */
> + else if ((*winfo = parallel_apply_find_worker(xid)))
> + {
> + return TRANS_LEADER_SEND_TO_PARALLEL;
> + }
> + else
> + {
> + return TRANS_LEADER_SERIALIZE;
> + }
> +}
>
> 22a.
>
> Previously I suggested the statement blocks are overkill and all the
> {} should be removed, and Houzj ([1] #52a) wrote:
> I feel this style is fine.
>
> ~
>
> Sure, it is fine, but FWIW I thought it is not the normal PG coding
> convention to use unnecessary {} unless it would seem strange to omit
> them.
>

Yeah, but here we are using comments in between the else if construct
due to which using {} makes it look better. I agree that this is
mostly a question of personal preference and we can go either way but
my preference would be to use the style patch has currently used.

>
> 23. src/test/regress/sql/subscription.sql
>
> Previously I mentioned testing the 'streaming' option with no value.
> Houzj replied ([1]
> I didn't find similar tests for no value explicitly specified cases,
> so I didn't add this for now.
>
> But as I also responded ([4] #58) already to Amit:
> IMO this one is a bit different because it's not really a boolean
> option anymore - it's a kind of a hybrid boolean/enum. That's why I
> thought this ought to be tested regardless if there are existing tests
> for the (normal) boolean options.
>

I still feel this is not required. I think we have to be cautious
about not adding too many tests in this area that are of less or no
value.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-10-13 05:07:40 Re: create subscription - improved warning message
Previous Message Andrey Lepikhov 2022-10-13 04:38:05 Re: Fast COPY FROM based on batch insert