RE: Time delayed LR (WAS Re: logical replication restrictions)

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2022-11-14 08:58:10
Message-ID: TYAPR01MB5866E747C3C5C1D99F88D179F5059@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Amit,

> I don't understand the reason for the below change in the patch:
>
> +    /*
> +     * If this subscription has been disabled and it has an apply
> +     * delay set, wake up the logical replication worker to finish
> +     * it as soon as possible.
> +     */
> +    if (!opts.enabled && sub->applydelay > 0)
> +        logicalrep_worker_wakeup(sub->oid, InvalidOid);
> +
>
> It seems to me Kuroda-San has proposed this change [1] to fix the test
> but it is not clear to me why such a change is required. Why can't
> CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
> code [2] in LogicalRepApplyLoop() sufficient to handle parameter
> updates?
>
> [2]
> if (!in_remote_transaction && !in_streamed_transaction)
> {
>     /*
>      * If we didn't get any transactions for a while there might be
>      * unconsumed invalidation messages in the queue, consume them
>      * now.
>      */
>     AcceptInvalidationMessages();
>     maybe_reread_subscription();
> ...

I was referring to the case where min_apply_delay is set to a long value.

The worker exits normally once apply_delay() has finished and control has
returned to LogicalRepApplyLoop(). That works well if the delay is short and the
worker wakes up soon. But if a worker has a long min_apply_delay, it cannot leave
the while-loop, so the worker process remains for a long time. The test code
expects the worker to die immediately, and we have a test case that tries to
kill a worker with min_apply_delay = 1 day.

Also note that the launcher process does not set a latch or send SIGTERM even
if the subscription is altered to enabled=f. In its main loop, the launcher
reads pg_subscription periodically, but it does not check for parameter changes.
It simply does nothing for subscriptions that it finds disabled.
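
To illustrate the timing problem, here is a toy model in plain C. It is only a
sketch and not PostgreSQL code: sim_notice_time_ms() and ONE_DAY_MS are
hypothetical names, and the latch wait is reduced to simple arithmetic. It
assumes that the delay starts at time 0 and that the only possible early wakeup
is the one sent when the subscription is disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* min_apply_delay = 1 day, expressed in milliseconds (24 * 60 * 60 * 1000). */
#define ONE_DAY_MS INT64_C(86400000)

/*
 * Hypothetical model, not PostgreSQL code.
 *
 * A worker starts waiting at time 0 for delay_ms. The subscription is
 * disabled at disable_ms. If wake_on_disable is true, the backend that
 * disables the subscription also wakes the worker; otherwise nobody does.
 *
 * Returns the simulated time (in ms) at which the worker leaves the wait
 * loop and gets a chance to re-read its subscription.
 */
int64_t
sim_notice_time_ms(int64_t delay_ms, int64_t disable_ms, bool wake_on_disable)
{
    int64_t now_ms = 0;

    while (now_ms < delay_ms)
    {
        int64_t remaining_ms = delay_ms - now_ms;

        /*
         * Model of a latch wait: sleep for the remaining delay unless the
         * wakeup arrives earlier.
         */
        if (wake_on_disable &&
            disable_ms >= now_ms &&
            disable_ms - now_ms < remaining_ms)
            return disable_ms;      /* woken early by the disabling backend */

        now_ms += remaining_ms;     /* nothing woke us up; the wait timed out */
    }

    return now_ms;
}
```

In this model, disabling the subscription 5 seconds into a 1-day delay is
noticed only after the full day without the wakeup, and after 5 seconds with it.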

If this situation can be ignored, we may be able to remove these lines.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
