Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-09 12:09:28
Message-ID: CAA4eK1+6=goeEw_wONRG5QuFcKpTL2u2n6BDhU-bwe0N_QY9Lg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 9, 2022 at 11:09 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> Some more comments
>
> + /*
> + * Exit if any relation is not in the READY state and if any worker is
> + * handling the streaming transaction at the same time. Because for
> + * streaming transactions that is being applied in apply background
> + * worker, we cannot decide whether to apply the change for a relation
> + * that is not in the READY state (see should_apply_changes_for_rel) as we
> + * won't know remote_final_lsn by that time.
> + */
> + if (list_length(ApplyBgworkersFreeList) != list_length(ApplyBgworkersList) &&
> + !AllTablesyncsReady())
> + {
> + ereport(LOG,
> + (errmsg("logical replication apply workers for subscription \"%s\" will restart",
> + MySubscription->name),
> + errdetail("Cannot handle streamed replication transaction by apply "
> + "background workers until all tables are synchronized")));
> +
> + proc_exit(0);
> + }
>
> How can this situation occur? I mean, while starting a background
> worker itself, we can check whether all tables are sync ready or not,
> right?
>

We already check this at startup in apply_bgworker_can_start(), but I
think we also need to check at a later point, because new rels can be
added to pg_subscription_rel via ALTER SUBSCRIPTION ... REFRESH
PUBLICATION. If that reasoning is correct, we can probably expand the
comments to make it clear.
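
To illustrate the scenario, here is a minimal stand-alone sketch, not code
from the patch. RelState, all_tablesyncs_ready() and must_restart() are
hypothetical names that only model the quoted condition: a worker is busy
(the free list is shorter than the full list) and some relation is not READY.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the sync state of a pg_subscription_rel entry. */
typedef enum { REL_INIT, REL_DATASYNC, REL_READY } RelState;

/* Models AllTablesyncsReady(): true only if every relation is READY. */
static bool
all_tablesyncs_ready(const RelState *rels, size_t nrels)
{
    for (size_t i = 0; i < nrels; i++)
    {
        if (rels[i] != REL_READY)
            return false;
    }
    return true;
}

/*
 * Models the quoted condition: restart is needed when at least one
 * background worker is busy (fewer free workers than total workers)
 * and at least one relation is not yet READY.
 */
static bool
must_restart(size_t nworkers_total, size_t nworkers_free,
             const RelState *rels, size_t nrels)
{
    return nworkers_free != nworkers_total &&
           !all_tablesyncs_ready(rels, nrels);
}
```

With one busy worker and two READY relations, must_restart() returns false,
which corresponds to the check passing at worker start. If a third relation
in a non-READY state is added afterwards (as a refresh could do), the same
call returns true, which is why a check only at start would miss it.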

> + /* Check the status of apply background worker if any. */
> + apply_bgworker_check_status();
> +
>
> What is the need for checking each worker's status on every commit? I
> mean, if there are a lot of small transactions along with some
> streaming transactions,
> then won't it affect the apply performance for those small transactions?
>

I don't think performance will be a concern, because this won't do any
costly operation unless an invalidation happens, in which case it will
access the system catalogs. However, if my understanding above is
correct that new tables can be added during the apply process, then I am
not sure that doing the check at commit time is sufficient/correct,
because the set of tables can change even during the transaction.
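
The "cheap unless invalidated" behaviour can be sketched as a cached answer
guarded by a validity flag. This is only an illustration under assumed names
(ReadyCache, ready_cache_invalidate(), ready_cache_check() are hypothetical),
and the catalog_all_ready argument stands in for whatever a real catalog scan
would return.

```c
#include <stdbool.h>

/* Hypothetical cache of the "all tables READY" answer. */
typedef struct
{
    bool valid;         /* false once an invalidation has arrived */
    bool all_ready;     /* cached answer, meaningful only when valid */
    int  catalog_reads; /* counts simulated catalog scans */
} ReadyCache;

/* Simulated invalidation callback: only marks the cache as stale. */
static void
ready_cache_invalidate(ReadyCache *cache)
{
    cache->valid = false;
}

/*
 * Returns the cached answer. The simulated catalog is consulted only
 * when the cache is stale, so repeated calls between invalidations
 * cost a single flag test.
 */
static bool
ready_cache_check(ReadyCache *cache, bool catalog_all_ready)
{
    if (!cache->valid)
    {
        cache->all_ready = catalog_all_ready;
        cache->catalog_reads++;
        cache->valid = true;
    }
    return cache->all_ready;
}
```

In this model, calling ready_cache_check() on every commit increments
catalog_reads only after ready_cache_invalidate() has run, so many small
transactions pay almost nothing. The sketch also shows the limitation raised
above: an invalidation arriving in the middle of a transaction is not noticed
until the next call.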

--
With Regards,
Amit Kapila.
